<-- HTML Generated by MacWeb on 12Oct94 (at 22:53:26) --> Computer Generated Random Numbers

Computer Generated Random Numbers

David W. Deley

Copyright 1991


                   COMPUTER GENERATED RANDOM NUMBERS
                                  by
                            David W. Deley
                                 c1991



SYNOPSIS

This paper presents a number of techniques for analyzing a computer
generated sequence of random numbers.  Some background theory in
probability theory and inferential statistics is presented, and a
number of empirical tests are presented along with example programs
which apply the tests to several popular random number generators
including the VAX Run-Time Library routine MTH$RANDOM used by the
FORTRAN intrinsic function RAN and by the VAX BASIC function RND, the
standard ANSI C rand() function used by VAX C, and the Data Encryption
Standard (DES) algorithm.  For comparison the sample programs also
apply the tests to the obsolete RANDU random number generator.


CONTENTS:

   1.0  INTRODUCTION
        1.1  MTH$RANDOM USAGE
        1.2  RANDU USAGE


   2.0  PROBABILITY THEORY
        2.1  ROLLING DIE
        2.2  THE LANGUAGE OF PROBABILITY


   3.0  INFERENTIAL STATISTICS
        3.1  THE PROBABILITY OF ERROR
        3.2  GENERAL TEST PROCEDURE
        3.3  THE CHI-SQUARE TEST
        3.4  THE KOLMOGOROV-SMIRNOV TEST


   4.0  ANALYSIS OF A SEQUENCE OF RANDOM NUMBERS
        4.1  ONE-DIMENSIONAL TESTS
        4.2  TWO-DIMENSIONAL TESTS
        4.3  THREE-DIMENSIONAL TESTS
        4.4  HIGHER DIMENSIONAL TESTS


   5.0  LINEAR CONGRUENTIAL RANDOM NUMBER GENERATORS
        5.1  THE MTH$RANDOM ALGORITHM
        5.2  THE RANDU ALGORITHM
        5.3  STARTING THE MTH$RANDOM GENERATOR
        5.4  STARTING THE RANDU GENERATOR
        5.5  STARTING THE ANSI C GENERATOR


   6.0  RATINGS FOR VARIOUS GENERATORS
        6.1  REVIEW OF TESTS PERFORMED
        6.2  TEST RESULTS
             6.2.1  MTH$RANDOM
             6.2.2  RANDU
             6.2.3  C and ANSI C
             6.2.4  Microsoft C
             6.2.5  Turbo Pascal


   7.0  THE SPECTRAL TEST


   8.0  SHUFFLING


   9.0  DES AS A RANDOM NUMBER GENERATOR


  10.0  HOW GOOD A RANDOM GENERATOR DO YOU NEED?


  11.0  WHAT IS A RANDOM NUMBER GENERATOR?


  12.0  CONCLUSION


  REFERENCES

---------------------------------------------------------------------------



1.0  INTRODUCTION

The VMS Run-Time Library provides a random number generator routine
called MTH$RANDOM.  On the first call you supply an initial SEED value
to the routine, and it returns a random deviate uniformly distributed
between [0,1).  Call the routine again and it returns another random
deviate uniformly distributed between [0,1).  Calling the routine over
and over again produces a sequence of random deviates all supposedly
uniformly distributed between [0,1).  Our question now is, is this
sequence of numbers really random?

Although there is nothing "random" about a completely deterministic
computer generated sequence of numbers, we can analyze the sequence of
numbers to see if there is any reason to doubt that the sequence could
have come from a truly random stochastic process.


1.1 MTH$RANDOM USAGE

The MTH$RANDOM routine is called by the FORTRAN intrinsic function RAN.
>From FORTRAN it is called as follows:

      INTEGER SEED
      R = RAN(SEED)



1.2 RANDU USAGE

The VMS FORTRAN Run-Time Library also contains a second random number
generator which implements the infamous RANDU random number generator
first introduced by IBM IN 1963.  This turned out to be a poor random
number generator, but nonetheless it has been widely spread.  If you
should ever come across it on some other machine, don't use it.  VMS
FORTRAN abandoned the RANDU generator in 1978 in favor of the MTH$RANDOM
routine.  The RANDU routine is considered obsolete and is now completely
undocumented, but it still exists in the VMS FORTRAN Run-Time Library
and will always exist to support any old programs which still use it.

>From FORTRAN the obsolete RANDU random number generator can be called in
any of three ways:
      INTEGER*4 SEED
      INTEGER*2 W(2)
      EQUIVALENCE( SEED, W(1) )

      R = FOR$IRAN(     W(1), W(2)    )       !Preferred way
      CALL FOR$RANDU(   W(1), W(2), R )       !Alternate way
      CALL FOR$RANDU_W( W(1), W(2), R )       !Alternate way

R is the return value between [0,1).  W(1) and W(2) together is the
seed value for the generator.  This goes back to the PDP-11 days of 16
bit integers.  SEED is really a 32 bit integer, but it was represented
as two 16 bit integers.

We will first cover some background theory in probability theory, and
then we will test the FORTRAN RAN and RANDU generators.  Hopefully we
will find out why the infamous RANDU generator is considered bad.

---------------------------------------------------------------------------
2.0  ROLLING DIE

Consider the case of rolling a single die.  A theoretician picks up the
die, examines it, and makes the following statement:  "The die has six
sides, each side is equally likely to turn up, therefore the
probability of any one particular side turning up is 1 out of 6 or 1/6.
Furthermore, each throw of the die is completely independent of all
previous throws."  With that the theoretician puts the die down and
leaves the room without ever having thrown it once.

An experimentalist, never content to accept what a cocky theoretician
states as fact, picks up the die and starts to roll it.  He rolls the
die a number of times, and carefully records the results of each throw.

Let us first dispel the myth that there is such a thing as an ideal
sequence of random numbers.  If the experimentalist threw the die 6
times and got 6 "one"s in a row, he would be very upset at the die.  If
the same experimentalist threw the same die a billion times in a row and
never got 6 "one"s in a row, he would again be very upset at the die.
The fact is a truly random sequence will exhibit local nonrandomness.
Therefore, a subsequence of a sequence of random numbers is not
necessarily random.  We are forced to conclude that no one sequence of
"random" numbers can be adequate for every application.



2.1  THE LANGUAGE OF PROBABILITY

Before we can analyze the resulting data the theoretician collected, we
must first review the language of probability theory and the underlying
mathematics.  (Unfortunately, the limitations of my character cell
terminal prevent me from directly typing many of the mathematical
symbols needed in this discussion.  So I'll make due as best I can and
hope the reader can follow me.)

A single throw of the die is called a "chance experiment" and is
designated by the capital letter E.  An outcome of a chance experiment
E is designated by the Greek letter ZETA.  If there is more than one
possible outcome of the chance experiment, (as there always will be,
else the analysis becomes trivial), the different possible outcomes
of the chance experiment are designated by ZETA_1, ZETA_2,...
The set of all possible outcomes of a chance experiment is called the
"Universal Set" or "Sample Space", and is denoted by the capital
letter S.

For the die throwing experiment E, we have:

Chance experiment:
     E:  one throw of the die

Possible outcomes:

     ZETA_1 =     .         (one dot)


               .
     ZETA_2 =               (two dots)
                     .

                     .
     ZETA_3 =     .         (three dots)
               .

               .     .
     ZETA_4 =               (four dots)
               .     .

               .     .
     ZETA_5 =     .         (five dots)
               .     .

               .  .  .
     ZETA_6 =               (six dots)
               .  .  .


Sample Space (or Universal Set):

     S = {ZETA_1, ZETA_2, ZETA_3, ZETA_4, ZETA_5, ZETA_6}


We now define the concept of a "random variable".  A random variable is
a function which maps the elements of the sample space S to points on
the real number line RR.  This is how we convert the outcomes of a
chance experiment E to numerical values.  The random variable function
is denoted by a capital X.  An actual value a random variable X takes
on is denoted by a lowercase x.

A natural random variable function X to define for the die rolling
experiment is:

     X( ZETA_1 ) = 1.0
     X( ZETA_2 ) = 2.0
     X( ZETA_3 ) = 3.0
     X( ZETA_4 ) = 4.0
     X( ZETA_5 ) = 5.0
     X( ZETA_6 ) = 6.0

Note that the faces of a standard die are marked with this random
variable function X in mind.  The number of dots showing on the top
face of the die corresponds to the value the function X takes on for
that particular outcome of the experiment.

We now consider the generation of a random sequence of numbers
by repeating a chance experiment E a number of times.  This can be
thought of as a single n-fold compound experiment, which can be
designated by:

                         E x E x . . . x E = En
                             (N times)

The sample space or universal set of all possible outcomes for our
compound experiment En is the Cartesian product of the sample spaces
for each individual experiment E:

                    Sn = S x S x . . . x S
                             (N times)

Any particular result of a compound experiment En will be an ordered
set of elements from set S.  For example, if E is the die throwing
experiment, then the elements of set Sn for rolling the die N times are:

       Sn = { ZETA_1, ZETA_1, ZETA_1, ZETA_1, ZETA_1, ZETA_1 }
            { ZETA_1, ZETA_1, ZETA_1, ZETA_1, ZETA_1, ZETA_2 }
            { ZETA_1, ZETA_1, ZETA_1, ZETA_1, ZETA_1, ZETA_3 }
               .       .       .       .       .       .
               .       .       .       .       .       .
            { ZETA_6, ZETA_6, ZETA_6, ZETA_6, ZETA_6, ZETA_6 }

The total number of elements in Sn is equal to the number of elements
in S raised to the Nth power.  For the die throwing experiment, there
are 6 possible outcomes for the first throw, 6 possible outcomes for
the second throw, etc.  so the total number of possible outcomes for
the compound experiment En is:

                  6 * 6 * . . . * 6 = 6**N  = |Sn|
                       (N times)

The probability of any particular element in Sn is equal to the product
of the probabilities of the corresponding elements from S which
together comprise the element in Sn.  If each element of S is equally
likely, then each element of Sn is equally likely.  Which translated
means if each possible outcome of experiment E is equally likely, then
each possible outcome of experiment En is equally likely.  For our die
throwing experiment, each of the 6 possible outcomes for throwing the
die once is equally likely, so each of the 6**N possible outcomes for
throwing the die N times is equally likely.

We can now take our random variable function X and apply it to each of
the N outcomes from our compound experiment En.  This is how we convert
the outcome of experiment En into an ordered set of N random variables:
                ( X1, X2, ..., Xn )

In this way each specific element of our sample space Sn can be
transformed into a set of n ordered real values.  We will use a
lowercase letter x to denote specific values that a random variable
function X takes on.  So the results of a specific compound experiment
En when transformed into specific real values would look something
like this:

                ( x1, x2, ..., xn )

For example, if we rolled a die N times we might get the result:

                ( 3, 5, 2, 6, ..., 4 )
                      (N times)

Then again, if we rolled a die N times we might get the result
consisting of all ones:

                ( 1, 1, 1, 1, ..., 1 )
                      (N times)

The first result looks somewhat like what we would expect a sequence of
random numbers to be.  The second result consisting of all ones however
tends to make us very unhappy.  But, and here is the crux, EACH
PARTICULAR OUTCOME IS EQUALLY LIKELY TO OCCUR!  Each outcome
corresponds to a particular element in Sn, and each element in Sn has
an equal probability of occurring.  We are forced to conclude that ANY
sequence of random numbers is as equally likely to occur as any other.
But for some reason, certain sequences make us very unhappy, while
other sequences do not.

We are forced to conclude, therefore, that our idea of a good sequence
of random numbers is more than just anything you might get by
repeatedly performing a random experiment.  We wish to exclude certain
elements from the sample space Sn as being poor choices for a sequence
of random numbers.  We accept only a subspace of Sn as being good
candidates for a sequence of random numbers.  But what is the
difference between a good sequence of random numbers and a bad sequence
of random numbers?

For our die rolling experiment, if we believe the die to be fair, and
roll the die N times, then we know all possible sequences of N outcomes
are equally likely to occur.  However, only a subset of those possible
outcomes will make us believe the die to be fair.  For example, the
sequence consisting of all ones {1,1,...,1} is as equally likely to
occur as any other sequence, but this sequence tends not to reinforce
our assumption that the die is fair.  A good sequence of random numbers
is one which makes us believe the process which created them is random.

---------------------------------------------------------------------------



3.0  INFERENTIAL STATISTICS

Up until now we have been dealing with probability theory.  We started
with a chance experiment in which the probability of each result is
given, and then worked our way forward to find out what the probability
of a particular compound experiment would be.  We now turn to the field
of inferential statistics, where we attempt to do the reverse.  Given a
sequence of numbers created by N trials of a chance experiment, we
attempt to work our way backwards and estimate what the defining
properties of the chance experiment are.



3.1  THE PROBABILITY OF ERROR

For our die throwing experiment, we would like to estimate the
probability of each outcome based on experimental data.  If we throw
the die N times, then the best estimate we could make for the
probability of a particular outcome ZETA would be to count the number
of times ZETA occurred in N trials and form the ratio:

              number of times ZETA occurred
              ------------------------------
                 total number of trials

If we define a frequency function f(ZETA) to be the number of times
ZETA occurred in N trials, we then would have:

                         f(ZETA)
             Pr(ZETA) =  -------
                            N

This is our estimate of the probability of ZETA based on N trials of the
chance experiment.  Note that some people use this equation as the
definition for the probability of an event ZETA when N goes to infinity:

                        _                 f(ZETA)
               Pr(ZETA) =  lim            -------
                           N -> infinity     N


For our die throwing experiment, assume we want to estimate the
probability of ZETA_1, (the side with one dot shows up).  Say we throw
the die 6 times, and out of those 6 times ZETA_1 occurred twice.  We
then calculate the probability of ZETA_1 as 2/6.

We now have the problem of deciding which value to accept as correct,
and how much faith we can place on our choice.  Let us define two
possible hypotheses, which we will denote as H0 and H1.
     H0 : The probability of ZETA_1 is 1/6
     H1 : The probability of ZETA_1 is 2/6

In choosing which hypothesis to believe, there are two possible
errors we could make, known as a type I error and a type II error:

     Type  I error : Choose H1 when H0 is actually true.
     Type II error : Choose H0 when H1 is actually true.

We can determine the probability of making each type of error by
examining the theoretical probability distributions for f(ZETA)
for each hypothesis being tested.

We rolled the die six times, so the possible number of times ZETA_1 can
occur is between 0 and 6 times.   Let f(ZETA_1) be the number of times
ZETA_1 occurred in six rolls of the die.  The following binomial
formula will calculate the probability of having ZETA_1 occur k times
in n trials:

                                / n \
       Pr( f(ZETA_1) = k )  =  |     | * p**k * q**(n - k)
                                \ k /
       Where:
           Pr( f(ZETA_1) = k ) = probability of exactly k occurrences
                                 of ZETA_1 in n trials

           f(ZETA_1) = number of occurrences of ZETA_1 in n trials

           n = number of trials

           p = Pr(ZETA_1)    (probability of ZETA_1)

           q = 1 - p

            / n \          n!
           |     |  =  -----------    (n things taken k at a time)
            \ k /      (n - k)! n!


With n = 6 trials, and assuming hypothesis H0 to be true (the
probability of throwing ZETA_1 is 1/6), we obtain the following table:

       f(ZETA_1)     Pr ( f(ZETA_1) | H0 )   (sideways bar graph)
      -----------   ----------------------
          0               0.334897           |================
          1               0.401877           |====================
          2               0.200938           |==========
          3               0.053583           |==
          4               0.008037           |
          5               0.000643           |
          6               0.000021           |

As you can see from the numbers, the highest probability is for
throwing one ZETA_1, although there is still a high probability for
throwing zero or two ZETA_1's.  If we repeat the binomial calculations
this time assuming hypothesis H1 is true (the probability of ZETA_1 is
2/6), we arrive at the following slightly different table:

       f(ZETA_1)     Pr ( f(ZETA_1) | H1 )   (sideways bar graph)
      -----------   ----------------------
          0               0.087791           |====
          1               0.263374           |=============
          2               0.329218           |================
          3               0.219478           |==========
          4               0.082304           |====
          5               0.016460           |=
          6               0.001371           |

We can now calculate the probability of making a type I error and of
making a type II error.

     Define:
         M  : f(ZETA_1) = 2
              the event that ZETA_1 occurs twice in six rolls
         H0 : The probability of ZETA_1 is 1/6
         H1 : The probability of ZETA_1 is 2/6

    Notation:
      Pr(A|B) : The probability of A given B


    From Bayes's Theorem we then have:

                   Pr(M|H0) Pr(H0)
        Pr(H0|M) = ---------------
                       Pr(M)

    applying the principle of total probability to Pr(M)

                           Pr(M|H0) Pr(H0)
        Pr(H0|M) = ---------------------------------
                   Pr(M|H0) Pr(H0) + Pr(M|H1) Pr(H1)


    and if we allow both hypothesis H0 and H1 to be equally likely
    so that Pr(H0) = Pr(H1) = 1/2

                       Pr(M|H0)
        Pr(H0|M) = -------------------
                   Pr(M|H0) + Pr(M|H1)

    Similarly, we can repeat the above steps to obtain

                        Pr(M|H1)
        Pr(H1|M) = -------------------
                   Pr(M|H0) + Pr(M|H1)


>From these two formulas and the tables above we can now calculate the
probability of making a type I error and a type II error.

    First note that
        Pr(M|H0) = probability of rolling ZETA_1 two out of 6 times
                   when probability of rolling ZETA_1 is 1/6
                 = 0.200938  (from table above)

        Pr(M|H1) = probability of rolling ZETA_1 two out of 6 times
                   when probability of rolling ZETA_1 = 2/6
                 = 0.329218  (from table above)

    Then from the above formulas and data we have:

                                          Pr(M|H0)
    Pr( type I error )  = Pr(H0|M) = -------------------
                                     Pr(M|H0) + Pr(M|H1)

                                          0.200938
                                   = -------------------
                                     0.200938 + 0.329218

                                   = 0.379


                                          Pr(M|H1)
    Pr( type II error ) = Pr(H1|M) = -------------------
                                     Pr(M|H0) + Pr(M|H1)

                                          0.329218
                                   = -------------------
                                     0.200938 + 0.329218

                                   = .621

In this case we have a 38% chance of being wrong if we choose to
believe Pr(ZETA_1) = 2/6 and a 62% chance of being wrong if we choose
to believe Pr(ZETA_1) = 1/6.  Either way we're likely to be wrong.  The
data does not allow us to differentiate between the two possibilities
with much confidence.


Let us repeat the above calculations this time assuming we rolled
ZETA_1 five out of six times.  This would lead us to the estimate that
Pr(ZETA_1) = 5/6.  Let us create a new hypothesis:

     H1 : The probability of ZETA_1 is 5/6

and let us repeat the binomial calculations assuming hypothesis H1 is
true (the probability of ZETA_1 is 5/6), we arrive at the following
table:
       f(ZETA_1)     Pr ( f(ZETA_1) | H1 )   (sideways bar graph)
      -----------   ----------------------
          0               0.000021           |
          1               0.000430           |
          2               0.008037           |
          3               0.053583           |==
          4               0.200938           |==========
          5               0.401877           |====================
          6               0.334897           |================

Following the same procedure as above:

     Define:
         M  : f(ZETA_1) = 5
              the event that ZETA_1 occurs five out of six rolls
         H0 : The probability of ZETA_1 is 1/6
         H1 : The probability of ZETA_1 is 5/6

                                          Pr(M|H0)
    Pr( type I error )  = Pr(H0|M) = -------------------
                                     Pr(M|H0) + Pr(M|H1)

                                          0.000643
                                   = -------------------
                                     0.000643 + 0.401877

                                   = 0.001


                                          Pr(M|H1)
    Pr( type II error ) = Pr(H1|M) = -------------------
                                     Pr(M|H0) + Pr(M|H1)

                                          0.401877
                                   = -------------------
                                     0.000643 + 0.401877

                                   = .998

In this case, given a choice between believing the theoretician who
claims Pr(ZETA_1) = 1/6 and believing our own tests which claim
Pr(ZETA_1) = 5/6, if we choose to believe the theoretician we have a
99.8% chance of being wrong, and if we choose to believe our own
experimental results we have a 0.1% chance of being wrong.  Clearly the
odds are in favor of our being right.  And we only threw the die six
times!

This then is why a sequence consisting of all ones { 1, 1, ..., 1 } is
unlikely to be from a compound experiment in which each outcome is
equally likely.  If we hypothesize that each outcome is equally likely
(hypothesis H0), then all elements in our sample space Sn do have an
equal probability.  But if we conjure up an alternative hypothesis
about the probability of outcomes (hypothesis H1), then all the
elements in our sample space Sn no longer have an equal probability.
The probability will be higher for some elements and lower for other
elements.  Our hypothesis H1 is specifically formulated to give
whatever sequence we actually got the highest probability possible.
If the resulting probability of our sequence occurring under our
original H0 hypothesis is very low, then it's almost for certain that
it will loose out under an alternate hypothesis H1.  Thus if all we
want to do is test whether or not to reject our original H0 hypothesis,
all we need to do is see what the probability of our sequence occurring
under our original H0 hypothesis is.  If it is very low, we can reject
our H0 hypothesis under the implied knowledge that a better hypothesis
H1 probably exists.  We don't actually have to formulate a better
hypothesis H1, we just have to know that we could make one if we cared
to.

Thus from here on out we won't go the whole distance of formulating an
alternative hypothesis H1 and then calculating the probabilities of
type I errors and type II errors.  We'll just say if an actual outcome
of a compound experiment has a very low probability of occurring under
our H0 hypothesis (or null hypothesis), then that hypothesis is
probably not true.



3.2  GENERAL TEST PROCEDURE

Our general procedure then for testing the results of a compound
experiment is:

     1.  Formulate a null hypothesis H0 about the single chance
         experiment which was repeated N times to create a sequence of
         N values.  To test a sequence of supposedly random numbers,
         our null hypothesis H0 is that each outcome of the chance
         experiment is equally likely, and that each trial of the
         chance experiment is independent of all previous trials.

     2.  Formulate a real valued function g which somehow tests the
         null hypothesis H0.  To test a sequence of supposedly random
         numbers, our function g might be one which counts the number
         of occurrences of a particular outcome.

     3.  Under our null hypothesis H0, mathematically define a sequence
         of N random variables:

                       ( X1, X2, ..., Xn )

         and apply our function g to the sequence of N random variables
         creating a new random variable Z:

                  Z = g( X1, X2, ..., Xn )

         Then determine the probability density function of Z either by
         mathematical calculations or by finding a table of the
         particular probability density function we're interested in.

     4.  Take the specific sequence of values supposedly obtained by
         performing a chance experiment N times:                       ( x1, x2, ..., xn )

         and apply the function g to obtain a specific value z

                  z = g( x1, x2, ..., xn )

     5.  Determine from the probability density function of Z how
         likely or unlikely we are to obtain our value of z assuming
         our null hypothesis H0 is true.  If the probability is very
         low, we may reject our hypothesis H0 as being most likely
         incorrect.



3.3  THE CHI-SQUARE TEST

We come now to our first real test for a sequence of random numbers.
The test is the Chi-Square (ki-square) test.  It tests the assumption
that the probability distribution for a given chance experiment is as
specified.  For our die throwing experiment, it tests the probability
that each possible outcome is equally likely with a probability of 1/6.
The formula for our Chi-Square g function is:

      Define:
           N       = number of trials
           k       = number of possible outcomes of the chance experiment
         f(ZETA_i) = number of occurrences of ZETA_i in N trials
         E(ZETA_i) = The expected number of occurrences of ZETA_i in N
                     trials.  E(ZETA_i) = N*Pr(ZETA_i).


                 i=k  [ f(ZETA_i) - E(ZETA_i) ]**2
         CHISQ = SUM  ----------------------------
                 i=1          E(ZETA_i)

We now go through the steps as defined above:
   1.  Formulate the null hypothesis H0.  Our null hypothesis H0 for
       the die throwing experiment is that each of the six possible
       outcomes is equally likely with a probability of 1/6.

   2.  Formulate a real valued function g which somehow tests the null
       hypothesis.  Our g function is the chi-square test.  (This is
       not exactly the chi-square function, but for N large enough it
       approaches the chi-square distribution.)  Note that if the
       result CHISQ is zero, then each possible outcome occurred exactly
       the expected number of times.  On the other hand, if the result
       CHISQ is very large, it indicates that one or more of the
       possible outcomes of the chance experiment occurred far less or
       far more than would be expected if our null hypothesis H0 was
       true.

   3.  Determine the probability density function of Z = g(X1,X2,...Xn).
       As mentioned in step 2, our chi-square test function g is not
       exactly the chi-square function, but for N large enough it
       approaches the chi-square distribution of which tables are
       available for in any book on probability theory or any book of
       mathematical tables such as the CRC Standard Mathematical Tables
       book.  The stated rule of thumb is N should be large enough so
       that every possible outcome is expected to occur at least 5
       times.  Thus for our die throwing experiment we should repeat it
       at least 30 times, since 30*(1/6) = 5.

       Each row of the chi-square distribution table is for a
       particular number of "degrees of freedom" which we will
       abbreviate to df.  The value to choose for df is one less than
       the number of possible outcomes for our chance experiment.  Thus
       for our die throwing experiment where there are 6 possible
       outcomes, we have df = 6 - 1 = 5.

       Below is the chi-square distribution for 5 degrees of freedom:

           p=1%     p=5%    p=25%   p=50%   p=75%   p=95%   p=99%
          0.5543   1.1455   2.675   4.351   6.626   11.07   15.09

       This is the cumulative chi-square distribution.  If we recall
       the distribution itself is typically bell shaped, then the
       extremes of the bell shaped curve where the probability is very
       low corresponds to the extremes of the cumulative distribution.
       Thus if we are below the p=5% mark or above the p=95% mark, the
       corresponding probability is quite low.  On the other hand, if
       we are between the p=25% mark and the p=75% mark, the
       corresponding probability is high.

       (Note that some tables list the probability for (1 - p) instead
       of p, so that 0.5543 is listed as p=99% instead of p=1%, and
       15.09 is listed as p=1% instead of p=99%.  It should be easy to
       tell by inspection which probability the table is listing.)

   4.  We can now apply our chi-square test to a specific outcome of
       rolling the die 30 times.  Assume we threw the die 30 times and
       obtained the following results:

              f(ZETA_1) = 7
              f(ZETA_2) = 5
              f(ZETA_3) = 4
              f(ZETA_4) = 6
              f(ZETA_5) = 6
              f(ZETA_6) = 2

       Then our CHISQ test becomes:

        (7-5)**2   (5-5)**2   (4-5)**2   (6-5)**2   (6-5)**2   (2-5)**2
CHISQ = -------- + -------- + -------- + -------- + -------- + --------
        30*(1/6)   30*(1/6)   30*(1/6)   30*(1/6)   30*(1/6)   30*(1/6)
      = 3.2

   5.  We can now check our result of 3.2 against the chi-square table
       above in step 3 and note that it falls between the p=25% mark
       and the p=50% mark.  This is a high probability region and so we
       are not led to believe our null hypothesis H0 is false.


Further notes on chi-square:

   Note that not only are overly large values of CHISQ considered
   highly improbable, but so are overly small values of CHISQ.  Whereas
   overly large values of CHISQ lead us to believe the probabilities of
   each possible outcome of our chance experiment are not what we
   thought them to be, overly small values of CHISQ lead us to believe
   the N trials of our chance experiment were not independent.

   As an example, if we threw a single die 6 times, there would be 6**6
   or 46656 possible outcomes.  If we calculate the CHISQ value for
   each of the possible values and tabulate them we get the following
   table:

             CHISQ   f(CHISQ)
               0       720
               2       10800
               4       16200
               6       9000
               8       7200
              12       2100
              14       450
              20       180
              30       6

   Although a CHISQ value of 0 happens 720 times, a CHISQ value of 2
   happens 10800 times, making it much more likely.  We can compare our
   CHISQ values here with the chi-square distribution table above for 5
   degrees of freedom.  Our values of CHISQ = 2 through 12 which appear
   to be the most likely ones lie above the p=5% mark and below the
   p=95% mark on the table, where we would expect them to lie.  Note
   that this comparison was for N = 6, although we have as a rule of
   thumb that N should be at least 30 for this test.

   Another note about the chi-square distribution, most tables of the
   chi-square distribution go up to 30 degrees of freedom.  Above that
   the distribution approaches the normal distribution.  If your chance
   experiment has more than 30 degrees of freedom (number of possible
   outcomes - 1), you can use the following formula to convert the CHISQ
   distribution for df > 30 into the normal distribution with mean = 0
   and variance = 1:

        given:  CHISQ (and) df

        Z = [ SQRT(24*CHISQ - 6*df + 16) - 3*SQRT(2*df) ] / 4

        Here Z is a normally distributed variable with mean = 0 and
        variance = 1.  As an example, let's assume we are simulating
        the random choosing of a single card from a deck of 52.  There
        are 52 possible outcomes.  We will need to repeat this
        experiment 52*5=260 times so that each possible outcome occurs
        about 5 times.  Our number of degrees of freedom is 52 - 1 = 51.
        We calculate CHISQ and get CHISQ = 30.4 .  We now calculate
        Z from the above equation given CHISQ = 30.4 and df = 51:

           Z = [ SQRT(24*30.4 - 6*51 + 16) - 3*SQRT(2*51) ] / 4
             = -2.33

        We now look up in a table of the cumulative normal distribution
        and find this value of Z corresponds to the first 1% of the
        distribution.  Thus our CHISQ value is much lower than
        expected.  We could conclude that something is probably amiss.

   One last note about chi-square.  If you are simulating the flipping
   of a coin, then the number of degrees of freedom is 1 (two possible
   outcomes).  Our rule of thumb about performing enough experiments so
   that each possible outcome occurs at least 5 times is not adequate
   in this case.  Because the number of degrees of freedom is so low,
   we need to compensate for this by increasing our number of trials.
   Enough trials so that each outcome is expected to occur at least 75
   times is required in this case.



3.4  THE KOLMOGOROV-SMIRNOV TEST

If we repeat the die throwing experiment 30 times, we can perform one
chi-square test on the results.  If we repeat the die throwing
experiment 3000 times, we can either perform one chi-square test to get
an idea of the long term probabilities of each possible outcome, or we
can perform 300 chi-square tests for each block of 30 trials to get an
idea of the relative short term fluctuations.  In the latter case, we
would expect the distribution of our 300 chi-square tests to approach
the chi-square distribution.

This brings us to our next subject, which is testing a set of data
points to see how closely it matches a given continuous distribution.
The Kolmogorov-Smirnov (KS) test compares a set of nc data points Snc
(e.g. set of a number of chi-square data points) with a given
cumulative distribution function P(x) (e.g. the chi-square cumulative
distribution).

Given a set of nc data points Snc, we can create a cumulative
distribution function Snc(x) from the data points.  The function will
be a step function, rising from 0 to 1, with each step size equal to
1/nc.  Below is an example which compares a cumulative distribution
step function Snc(x) for 4 data values against a continuous cumulative
distribution P(x) for a function which is uniform between 0 and 1 (the
diagonal line):

                 |
              1.0|               +/--
                 |              /|
                 |       +----/--+
                 |       |  /
                 |     +-+/
                 |     |/
                 |   +/+
                 |  /|
             0.0 |/--+---------------
                 0                1

The KS test is now very simple.  Define D as the maximum vertical
difference between these two functions:

          D = max |Snc(x) - P(x)|

And define DQ as:

          DQ = SQRT(nc)*D

where nc is the number of data points.

We can then go directly to a table for the KS test (similar to the
chi-square table), and read off where in the cumulative distribution
our DQ value lies, remembering that values below p=5% only occur 5% of
the time, and values above p=95% only occur 5% of the time.  Tables for
this KS test are not as easy to find as tables for the chi-square test.
Fortunately, as in the chi-square case, if our number of data values nc
in set Snc is greater than 30, the distribution approaches something
which can be easily calculated as follows:

        Given:  D = max |Snc(x) - P(x)|
                nc > 30  (number of data points)

                                       1
                p = 1  -  --------------------------------
                               2*D**2      2*D         1
                          exp( ------ + ---------- + ----- )
                                 1      3*SQRT(nc)   18*nc


----------------------------------------------------------------------



4.0  ANALYSIS OF A SEQUENCE OF RANDOM NUMBERS

Enough theory for now.  Let's turn to the analysis of a sequence of
random numbers.  There's nothing "random" about any specific sequence of
numbers.  But we can analyze the sequence of numbers to see if we can
find any sufficient reason to doubt the hypothesis that the numbers
could have come from multiple independent trials of a chance experiment.



4.1  ONE-DIMENSIONAL TESTS

Our first test then is to see if each possible outcome appears to be
equally likely.  Imagine we have a number of bins, and we toss balls at
the bins.  After throwing a certain number of balls at the bins, we
count up how many balls fell in each bin.  We then apply the chi-square
test to determine if the balls appear to have an equal probability of
falling in each bin.

The MTH$RANDOM and RANDU generators can be set up to generate numbers
between 1 and NBINS as follows:

      I = INT( NBINS * RAN(SEED)             ) + 1    !MTH$RANDOM
      I = INT( NBINS * FOR$IRAN(SEED1,SEED2) ) + 1    !RANDU

Arbitrarily choosing NBINS = 30 (number of bins), SEED = 1 (initial SEED
value), and BALLS = 300, the chi-square test was performed 10 times on
the MTH$RANDOM generator with the following results:

       30 BINS, 300 BALLS, 10 CHI-SQUARE TESTS, MTH$RANDOM
       BALLS= 300      CHISQ= 35.1999969     PROB= 0.8019531
       BALLS= 300      CHISQ= 22.8000031     PROB= 0.2143845
       BALLS= 300      CHISQ= 36.7999992     PROB= 0.8485908
       BALLS= 300      CHISQ= 19.8000011     PROB= 0.1009650
       BALLS= 300      CHISQ= 48.7999992     PROB= 0.9878765
       BALLS= 300      CHISQ= 29.3999996     PROB= 0.5556139
       BALLS= 300      CHISQ= 22.8000011     PROB= 0.2143840
       BALLS= 300      CHISQ= 36.5999985     PROB= 0.8432655
       BALLS= 300      CHISQ= 29.3999996     PROB= 0.5556139
       BALLS= 300      CHISQ= 18.6000004     PROB= 0.0688844

Here CHISQ is the chi-square value, and PROB is the percentage of random
sequences (with 30 - 1 = 29 degrees of freedom) which would have a lower
CHISQ value.  An extremely high value of PROB, say above 0.990, would
indicate that some bins had a much higher than expected number of balls
in them while other bins have a much lower than expected number of balls
in them.  In general, an extremely high value of PROB indicates that the
experimental results do not reinforce the hypothesis that each bin
had an equally likely chance of getting a ball.  So far all of the PROB
values appear reasonable.

We can further analyze the results of the 10 chi-square tests above by
noting that the distribution of the 10 CHISQ values should approximate
the chi-square distribution (for 30 - 1 = 29 degrees of freedom).  We
can apply the KS test to the 10 CHISQ values to see how well their
distribution approximates the chi-square distribution for d.f. = 29.

       KS D= 0.2019531  PROB= 0.8093885

Here D is the KS D value which is the maximum distance between the
approximate and the real chi-square function, and PROB is the
percentage of times we would have a larger D value, meaning the
approximation could be worse.  A very low value of PROB here means it
couldn't get much worse.  Anything above a very low value means our
distribution bears at least some resemblance to the function we're
comparing it to.  Again this result appears reasonable.

The following values were obtained when the above chi-square test was
applied 10 times to the RANDU generator:

       30 BINS, 300 BALLS, 10 CHI-SQUARE TESTS, RANDU
       BALLS= 300      CHISQ= 31.3999996     PROB= 0.6531839
       BALLS= 300      CHISQ= 60.7999992     PROB= 0.9995093
       BALLS= 300      CHISQ= 33.4000015     PROB= 0.7381005
       BALLS= 300      CHISQ= 24.3999977     PROB= 0.2910264
       BALLS= 300      CHISQ= 20.8000011     PROB= 0.1336691
       BALLS= 300      CHISQ= 16.6000023     PROB= 0.0319657
       BALLS= 300      CHISQ= 32.0000000     PROB= 0.6801265
       BALLS= 300      CHISQ= 30.2000027     PROB= 0.5959312
       BALLS= 300      CHISQ= 31.2000027     PROB= 0.6439425
       BALLS= 300      CHISQ= 45.5999985     PROB= 0.9742958

And a KS test on the above 10 CHISQ values:
      KS D= 0.2959312  PROB= 0.3452127

Again there is nothing as yet unusual about either the MTH$RANDOM or
RANDU generators.

We can also test the MTH$RANDOM and RANDU generators using the KS test
directly to see how well the output fits the hypothesized uniform
distribution.


      KS TEST ON MTH$RANDOM  (Start with SEED = 1)
      KS NPOINTS=      10    D= 0.2142200    PROB= 0.7484154
      KS NPOINTS=     100    D= 0.0944867    PROB= 0.3338285
      KS NPOINTS=    1000    D= 0.0314864    PROB= 0.2746516
      KS NPOINTS=   10000    D= 0.0079555    PROB= 0.5514056
      KS NPOINTS=  100000    D= 0.0017763    PROB= 0.9105781
      KS NPOINTS= 1000000    D= 0.0009270    PROB= 0.3565834


      KS TEST ON RANDU  (Start with SEED = 1)
      KS NPOINTS=      10    D= 0.6555050    PROB= 0.0003705
      KS NPOINTS=     100    D= 0.1326773    PROB= 0.0591586
      KS NPOINTS=    1000    D= 0.0337385    PROB= 0.2050490
      KS NPOINTS=   10000    D= 0.0063568    PROB= 0.8138239
      KS NPOINTS=  100000    D= 0.0042999    PROB= 0.0495556
      KS NPOINTS= 1000000    D= 0.0007990    PROB= 0.5457213


Note the only thing questionable here is the extremely low value of
PROB on RANDU for the first 10 points.  This indicates that perhaps a
seed value of 1 for the RANDU generator is not a good place to start
it.  Indeed, the first 10 numbers from RANDU starting with SEED = 1
are:

     RANDU first 10 numbers starting with seed = 1
           1      0.0001831
           2      0.0032959
           3      0.0444950
           4      0.5339386
           5      0.0068024
           6      0.8734164
           7      0.1705012
           8      0.3222913
           9      0.9906484
          10      0.7260775

Notice that half of the first 10 values are either below 0.006 or above
0.99.  This alone does not indicate a problem with the RANDU generator,
it does indicate that a SEED value of 1 is a poor choice to start it
with.  Note that the MTH$RANDOM generator does not have this problem when
started with SEED = 1.



4.2  TWO-DIMENSIONAL TESTS

Our original hypothesis about our sequence of supposedly random numbers
was that it could have come from a repeated chance experiment in which
each possible outcome was equally likely and each test of the chance
experiment was independent of all previous tests.  So far we have been
testing the sequence to see if each possible outcome appears to be
equally likely.  The remainder of our tests now focus on checking the
hypothesis that each outcome is independent of all previous outcomes.

If we throw a die and get a 5, what is the probability that the next
throw of the die will be a 3?  If the probability is still 1/6, we say
the next outcome is independent of the previous outcome.  i.e. knowledge
of the previous outcome does not give us any help in predicting the next
outcome.

The mathematical definition of statistical independence between two
random variables can be written as:

        Pr( X(i+1) | X(i) ) = Pr( X(i+1) )

The serial test in two dimensions checks to see if there is any
correlation between two consecutive outcomes of the random number
generator.  This is a chi-square test performed on pairs of random
numbers.  For the die throwing experiment, we would construct a total of
36 bins numbered BIN(1,1) - BIN(6,6).  One bin for each possible pair of
numbers we may get.  We then throw the die twice to get an ordered pair
of numbers, and drop a ball in the corresponding bin.  We throw the die
enough times so that each bin is expected to get an average of at least
5 balls.  Then we perform a chi-square test on the results to see if the
balls appear to be more or less evenly distributed among all the bins.

This serial test for pairs of outcomes was done on both the MTH$RANDOM
and RANDU generators with the following results:

  SERIAL 2D  BINS=30x30  BALLS = 9000  CHI-SQUARE TESTS = 10  MTH$RANDOM
       BALLS= 9000     CHISQ= 895.7998657    PROB= 0.4761399
       BALLS= 9000     CHISQ= 945.2001343    PROB= 0.8615244
       BALLS= 9000     CHISQ= 883.6000366    PROB= 0.3633031
       BALLS= 9000     CHISQ= 905.0000000    PROB= 0.5624363
       BALLS= 9000     CHISQ= 902.3989868    PROB= 0.5382197
       BALLS= 9000     CHISQ= 911.8001709    PROB= 0.6241364
       BALLS= 9000     CHISQ= 932.4005737    PROB= 0.7863315
       BALLS= 9000     CHISQ= 865.4000854    PROB= 0.2157318
       BALLS= 9000     CHISQ= 909.5996704    PROB= 0.6043593
       BALLS= 9000     CHISQ= 901.7994385    PROB= 0.5325246

KS test to see how well the 10 CHISQ values fit the chi-square
distribution:
      KS D= 0.2667681   PROB= 0.4751010


  SERIAL 2D  BINS=30x30  BALLS = 9000  CHI-SQUARE TESTS = 10  RANDU
       BALLS= 9000     CHISQ= 848.7999268    PROB= 0.1168594
       BALLS= 9000     CHISQ= 891.7999878    PROB= 0.4386667
       BALLS= 9000     CHISQ= 874.1998291    PROB= 0.2827876
       BALLS= 9000     CHISQ= 884.1997681    PROB= 0.3687753
       BALLS= 9000     CHISQ= 799.5999756    PROB= 0.0077433
       BALLS= 9000     CHISQ= 841.8005981    PROB= 0.0864621
       BALLS= 9000     CHISQ= 867.8007202    PROB= 0.2330545
       BALLS= 9000     CHISQ= 892.6010132    PROB= 0.4461077
       BALLS= 9000     CHISQ= 952.4000854    PROB= 0.8945085
       BALLS= 9000     CHISQ= 921.5999146    PROB= 0.7069101

KS test to see how well the 10 CHISQ values fit the chi-square
distribution:
      KS D= 0.3631590   PROB= 0.1430003

So far there is nothing blatantly wrong with either the MTH$RANDOM or
RANDU generators.  All the chi-square values are within reason, and the
distribution of the chi-square values appears to fit the chi-square
distribution.


4.3  THREE-DIMENSIONAL TESTS

We can now expand the serial test to triples of numbers.  Suppose you
are throwing a die, and you know that the results of the last two throws
were a 2 and then a 5.  What is the probability that you will now throw
a 3?  It should still be 1/6, because the outcome of this throw should
not be affected by the outcome of the previous two throws.  We can test
for this by constructing 6**3 = 216 bins labeled BIN(1,1,1) -
BIN(6,6,6), one bin for each possible triple of outcomes.  We then
throw the die three times to get a triple of numbers, and then drop
a ball in the corresponding bin.  We throw the die enough times so that
each bin is expected to get an average of at least 5 balls.  Then we
perform a chi-square test on the results to see if the balls appear to
be more or less evenly distributed among the bins.

This 3-dimensional test extends the test for statistical independence. 
If we assume the first two variables are independent, then this test
checks to see if the third variable is also independent.  The
mathematical definition of statistical independence between two random
variables can be written as:

        Pr( X(i+2) | X(i+1),X(i) ) = Pr( X(i+2) )

This serial test for triples of outcomes was done on both the
MTH$RANDOM and RANDU generators.  The generators were set up to simulate
the throwing of a 30 sided die.  A total of 270000 balls were tossed so
that an average of 10 balls were expected to fall in each bin.  A
chi-square test was then performed on the results.  This was repeated a
total of 10 times, and a KS test was done on the resulting chi-square
values to see how well they fit the chi-square distribution.  Below are
the results:

       SERIAL 3D  BINS=30x30x30  BALLS = 270000  MTH$RANDOM
       BALLS= 270000    CHISQ= 27233.4375000    PROB= 0.8438070
       BALLS= 270000    CHISQ= 26732.8027344    PROB= 0.1262939
       BALLS= 270000    CHISQ= 26866.4550781    PROB= 0.2845250
       BALLS= 270000    CHISQ= 26765.3710938    PROB= 0.1561499
       BALLS= 270000    CHISQ= 26650.6250000    PROB= 0.0659529
       BALLS= 270000    CHISQ= 26665.5117188    PROB= 0.0751096
       BALLS= 270000    CHISQ= 27165.1523438    PROB= 0.7621238
       BALLS= 270000    CHISQ= 26861.5625000    PROB= 0.2786521
       BALLS= 270000    CHISQ= 27002.1171875    PROB= 0.5027421
       BALLS= 270000    CHISQ= 27090.8613281    PROB= 0.6547577

KS test to see how well the 10 CHISQ values fit the chi-square
distribution:
       KS D= 0.3162388   PROB= 0.2699622

       SERIAL 3D  BINS=30x30x30  BALLS = 270000  RANDU
       BALLS= 270000    CHISQ= 454159.3750000    PROB= 1.0000000
       BALLS= 270000    CHISQ= 452698.1250000    PROB= 1.0000000
       BALLS= 270000    CHISQ= 455055.3437500    PROB= 1.0000000
       BALLS= 270000    CHISQ= 455938.2812500    PROB= 1.0000000
       BALLS= 270000    CHISQ= 454747.7187500    PROB= 1.0000000
       BALLS= 270000    CHISQ= 453017.8125000    PROB= 1.0000000
       BALLS= 270000    CHISQ= 453818.7812500    PROB= 1.0000000
       BALLS= 270000    CHISQ= 454553.5937500    PROB= 1.0000000
       BALLS= 270000    CHISQ= 453205.6875000    PROB= 1.0000000
       BALLS= 270000    CHISQ= 454122.0625000    PROB= 1.0000000

KS test to see how well the 10 CHISQ values fit the chi-square
distribution:
       KS D= 1.0000000    PROB= 0.0000000

It is immediately obvious that the RANDU generator has failed miserably
on the serial test for triples!  The chi-square value is way too high
on every one of the 10 trials.



4.4  HIGHER DIMENSIONAL TESTS

We can easily expand the chi-square tests for higher dimensions.  One
thing to note though, the number of bins increases exponentially as the
number of dimensions increases.  Thus there is a practical limit as to
how high a dimension we can carry this test.  In section 6 (Ratings for
Various Generators) we analyze 5 popular random number generators by
using the chi-square test for dimensions 1 through 8.  Higher
dimensions are not as serious since such a large number of random
numbers are required before some nonrandomness can be detected.

Before we use these tests to analyze some random number generators we
should first take a look at what's inside these generators and how to
properly use them.

---------------------------------------------------------------------------



5.0  LINEAR CONGRUENTIAL RANDOM NUMBER GENERATORS

Both the MTH$RANDOM and the RANDU algorithms are specific instances of
linear congruential generators.  The general form of a linear
congruential generator is:

        SEED = (A * SEED + C) mod M

        X = SEED/M

The user supplies an initial integer value for SEED.  On each call to
the random number generator a new value of SEED is calculated by taking
the current value of SEED, multiplying by A, adding C, and taking the
remainder MOD M of the result.  The new value of SEED is an integer
between 0 and M-1.  This new value of SEED is then converted into a
floating point value between 0 inclusive and 1 exclusive by dividing
SEED by M.  The result X is returned as a floating point random number
in the range [0,1).

First note some obvious properties of sequences generated by this method:

    1.  The modulus M is an upper bound on the number of different values
        SEED can take on.

    2.  If the new value of SEED is the same as one we had before,
        the sequence will begin to repeat itself in a cyclic
        manner.

    3.  All sequences generated by a linear congruential formula will
        eventually enter a cycle which repeats itself endlessly.

A good linear congruential formula will generate a long sequence of
numbers before repeating itself.  The maximum length a sequence can
possibly be before repeating itself is M, the modulus.  This is because
there are only M different possible values SEED can take on.  A linear
congruential formula will generate a sequence of maximum length M if and
only if the following three conditions are met (See Knuth for a proof of
this theorem):

      i)  C is relatively prime to M;
     ii)  B = (A - 1) is a multiple of P, for every prime P dividing M;
    iii)  B = (A - 1) is a multiple of 4, if M is a multiple of 4.



5.1  THE MTH$RANDOM ALGORITHM:

We now take a look at the MTH$RANDOM function.  The algorithm MTH$RANDOM
uses to generate successive random numbers is:

        SEED = (69069*SEED + 1) MOD 2**32

        X = SEED/2**32

This is a linear congruential generator with:

        A = 69069
        C = 1
        M = 2**32

Note first that MTH$RANDOM satisfies the three necessary and sufficient
conditions above which insure that the sequence it generates is of maximum
length:

      i)  1 is relatively prime to 2**32
          since 1 is relatively prime to all numbers.
     ii)  69068 is a multiple of 2, which is the only prime dividing 2**32.
    iii)  69068 is a multiple of 4.

Thus the sequence generated by MTH$RANDOM is of maximal length,
generating all 2**32 possible values of SEED before repeating itself.

Note for the MTH$RANDOM function if SEED is initially an ODD value
then the new value of SEED will always be an even value.  And if SEED
is an EVEN value, then the new value of SEED will be an ODD value.
Thus if the algorithm is repeatedly called, the value of SEED will
alternate between EVEN and ODD values.



5.2  THE RANDU ALGORITHM

Now let us look at the RANDU function.  The algorithm RANDU uses togenerate successive random numbers is:

        SEED = (65539*SEED) MOD 2**31

        X = SEED/2**31

This is a linear congruential generator with:

        A = 65539
        C = 0
        M = 2**31

Note first that the modulus M for RANDU is 2**31 whereas the modulus M
for MTH$RANDOM is 2**32.  The maximum sequence length for RANDU is thus
at least 1/2 of what it is for MTH$RANDOM.  Note also that if SEED is
initially an odd value, the new SEED generated will also be an odd
value.  Similarly, if SEED is initially an even value, the new SEED
generated will also be an even value.  Thus there are at least two
separate disjoint cycles for the RANDU generator, depending upon whether
you start with an even SEED value or and odd SEED value.

Actually, the situation is even worse than that.  For odd SEED values
there are two separate disjoint cycles, one generated by the SEED value
1, and one generated by the SEED value 5.  Although we can generate each
full cycle by initializing SEED to any value the cycle takes on, it is
convenient to refer to the smallest SEED value the cycle takes on as the
one which generates the cycle.  This helps us differentiate between
different cycles.  We will denote the sequence generated by the SEED
value 1 as <1>, and the sequence generated by the SEED value 5 as <5>.

The cycles <1> and <5> each contain 536,870,912 values.  Together they
account for all of the (2**31)/2 possible odd SEED values.  The
following table summarizes the cycles for odd values of SEED:

    TABLE 5.2.1  RANDU WITH ODD VALUES OF SEED

           CYCLE     LENGTH OF CYCLE
             <1>     536,870,912
             <5>     536,870,912


Even values of SEED do not have it so nice.  There are 30 different
disjoint cycles using even SEED values.  Some of them are quite short:

    TABLE 5.2.2  RANDU WITH EVEN VALUES OF SEED

           CYCLE     LENGTH OF CYCLE
             <2>     268435456
             <4>     134217728
             <8>      67108864
            <10>     268435456
            <16>      33554432
            <20>     134217728
            <32>      16777216
            <40>      67108864
            <64>       8388608
            <80>      33554432
           <128>       4194304
           <160>      16777216
           <256>       2097152
           <320>       8388608
           <512>       1048576
           <640>       4194304
          <1024>        524288
          <1280>       2097152
          <2048>        262144
          <2560>       1048576
          <4096>        131072
          <5120>        524288
          <8192>         65536
         <10240>        262144
         <16384>         32768
         <20480>        131072
         <32768>         16384
         <40960>         65536
         <81920>         32768
        <163840>         16384
                     ---------
        Total:      1073709056

There are a total of (2**31)/2 = 1,073,741,824 possible even SEED
values.  So far we've accounted for 1,073,709,056 of them.  The
remaining 32768 SEED values are ones for which the 31 bit binary
representation of them has the lower 16 bits set to 0.  These SEED
values are treated by RANDU as if the SEED value were 1, and they
result in the cycle <1>.
We can see from this that even values of SEED should be avoided when
using RANDU since they can give us very short cycles.

-----------------------------------------------------------------------



5.3  STARTING THE MTH$RANDOM GENERATOR

The initial SEED value for the MTH$RANDOM generator can be any integer
value between 0 and (2**32)-1 = 4294967295.  If we could "randomly"
choose a number in this range and insure that each number had an equal
probability of being chosen, we'd be set.  However, human nature being
what it is, simply picking a number out of the air usually does not
satisfy the criterion for randomness, especially when there are over
four billion numbers to choose from.

For many applications it doesn't matter what the initial few numbers
from the random number generator are, and ANY initial seed value will
suffice.  Note also some myths about choosing the initial seed value
for the MTH$RANDOM generator.  The VAX FORTRAN manual recommends using
an odd value.  This piece of advice is obviously left over from the
RANDU days.  The obsolete RANDU generator does require an odd seed
value, but for the MTH$RANDOM generator the seed value alternates
between even and odd values at each iteration, so there is no advantage
to starting with an odd value.

Another incorrect comment about the MTH$RANDOM generator can be found
in the VMS source code comments.  There it indicates that the
MTH$RANDOM generator might do poorly in a 3-D test for randomness, but
actually MTH$RANDOM does quite well in a 3-D test.  It's the RANDU
generator which does poorly in a 3-D test.

More important than starting the MTH$RANDOM generator to get
one random sequence is the problem of restarting the generator to get
several different sequences.  You may wish to run a simulation
several times and use a different random sequence each time.

One simple method would be to use consecutive seed values for each run.
For example, you could generate ten different random sequences using
initial seed values of 1 through 10.  If we compare side by side these
ten random sequences the first thing we'll note is the first value of
each sequence is nearly identical.  This is because there is a very
strong correlation between the initial SEED value and the first value
output from the generator.  The simple relationship between SEED and
the first output value is approximately:

                       SEED
                 X ~= ------- mod 1
                      62183.7

where mod 1 here indicates when SEED/62183.7 becomes greater than 1 we
wrap around and start counting from 0 again.

Thus if SEED changes by 10 the first value changes by only (10/62183.7)
~= 0.00016 .  Note that this correlation between the initial SEED value
and the first value output from the MTH$RANDOM generator will happen no
matter what the initial value of SEED is.  Large initial values of SEED
are no better than small initial values of SEED.

Besides the initial expected correlation of the first value output from
the generator, there are also unexpected correlations between other
sequence elements.  For the example above, not only are the first
values of all 10 sequences nearly identical, there is also a strong
correlation for elements 6, 13, 39, 41, 54, 60, 65, 72, 82, and 84.
You can play with program CORELA.FOR to see these correlations for
yourself.  Program CORELA generates the first 100 values for ten
different sequences, and then compares them.  Most of the elements are
quite random, but once in a while you come across an element which is
nearly identical in all ten sequences.  File CORELA.DOC shows some
sample outputs with comments.

It seems to be not all that easy to get away from this problem of
occasional correlation between different sequences.  If instead of
using the seed values {1,2,3,...,10} you used the seed values
{1,1001,2001,3001,...,9001}, you would still get occasional elements
which are strongly correlated, they just occur in slightly different
places.  In this example they occur at elements 1, 11, 17, 32, and 80.

For many applications this occasional correlation may not be important.
Each sequence when taken as a whole is quite different from the other
ones.

One method for dealing with the correlation between the initial SEED
value and the first number output from the random number generator is
to simply throw away the first number.  The second number output from
the generator has a much weaker correlation with the initial SEED
value.  It is much more "random" in that very small changes in the
initial SEED value produce very large and somewhat unpredictable
changes in it's value.  And since you're throwing away numbers you
might as well throw away the second number too.  The third number
output from the MTH$RANDOM generator is quite random compared to the
initial SEED value.

There is another way to view this operation of throwing away the first
two values.  We need to come up with a good "random" initial SEED value
to start the generator with.  So we can use our random number generator
to help us generate a random SEED value to start our random number
generator with!  We select an initial nonrandom SEED value, then run
our random number generator a few cycles to generate a random SEED
value.  We then restart our random number generator with the new random
SEED value, and the output is then a properly initialized random
sequence.  Note that this is the same as starting with a nonrandom SEED
value and throwing away the first few numbers output from the
generator.

Another method to deal with correlation is to use the shuffling
technique to be described in section 8 (Shuffling).  This technique
essentially removes all problems of correlation and also greatly
improves the randomness of the numbers.  If shuffling is used, then ANY
set of initial seed values may be used with confidence including
something as simple as {1,2,...10}.



5.4  STARTING THE RANDU GENERATOR

The RANDU generator should only be started with an odd SEED value, since
even initial SEED values can result in sequences with very short
periods.  Note that for RANDU, if the initial SEED value is odd, then
all subsequent SEED values will be odd.

The rest of the problems mentioned for MTH$RANDOM also apply to RANDU,
except that instead of throwing away the first two random numbers
output you should instead throw away the first ten numbers output.

Actually you shouldn't be using the RANDU generator at all because of
it's very poor 3-D performance.



5.5  STARTING THE ANSI C GENERATOR

The ANSI C generator is very much like the MTH$RANDOM generator.  All
comments for the MTH$RANDOM generator also apply to the ANSI C
generator.  See the comments above for the MTH$RANDOM generator.

---------------------------------------------------------------------------



6.0  RATINGS FOR VARIOUS GENERATORS

Here are the results of testing 5 different commonly found generators.
The generators tested are:

   1. MTH$RANDOM
      used by VAX FORTRAN and VAX BASIC   2. RANDU
      obsolete but still found on some systems

   3. C and ANSI C
      The standard rand() function for C and ANSI C as defined by
      Brian W. Kernighan and Dennis M. Ritchie in their book "The C
      programming language".  Used by VAX C.
   4. Microsoft C
      The rand() function in the Microsoft C library v4.0.

   5. Turbo Pascal
      The random function in Turbo Pascal v6.0



6.1  REVIEW OF TESTS PERFORMED

We'll review once more the tests which were performed on each
generator.
6.1.1  THE 1-D TEST

The 1-D test is the frequency test.  Imagine a number line stretching
from 0 to 1.  We will use our random number generator to plot random
points on this line.  First we divide the line into a number of
segments or "bins".  Below is an example of dividing the unit line into
4 segments or bins:

             1         2         3        4
        |---------|---------|---------|---------|
        0        .25       .50       .75       1.0

We then see how randomly the random number generator fills our bins.
If the bins are filled too unevenly, the Chi-Square test will give a
value that's too high indicating the points do not appear random.
For example, if all the points landed in bin 1 and no points landed in
any of the other bins, we would conclude that the generator did not
appear to be random.  On the other hand, if the bins are filled too
evenly, the Chi-Square test will give a value that's too low, also
indicating the points do not appear random.  For example, if after
plotting 1000 points each of the four bins had exactly the same number
of points in them, we would conclude that the generator was not random
enough.  The values given for the 1-D test is the approximate number
of bins above which the random number generator fails the Chi-Square
test in one dimension.



6.1.2  THE 2-D TEST

The 2-D test is the serial test.  We can imagine a unit square
stretching from (0,0) at one tip to (1,1) at the other tip.  We will
use the random number generator to plot points on this unit square.  To
plot a point we generate two random numbers and use them as the
coordinates.  But first we divide the unit square into a number of
smaller squares or "bins", in a gridlike pattern.  Below is an example
of dividing up the unit square into a total of 16 bins with 4 bins on a
side:

              1.0|---------|---------|---------|---------|
                 |         |         |         |         |
                 |         |         |         |         |
              .75|---------|---------|---------|---------|
                 |         |         |         |         |
                 |         |         |         |         |
              .50|---------|---------|---------|---------|
                 |         |         |         |         |
                 |         |         |         |         |
              .25|---------|---------|---------|---------|
                 |         |         |         |         |
                 |         |         |         |         |
                 |---------|---------|---------|---------|
                 0        .25       .50       .75       1.0

We then see how randomly the random number generator fills our bins,
the same as in the 1-D test.  The value given for the 2-D test is the
approximate number of bins per side or bins per dimension (bpd) above
which the random number generator fails the Chi-Square test in two
dimensions.

If we were to look at the unit square after we had plotted a large
number of points, we would notice that rather than appearing random,
the points all lined up on parallel lines.  All linear congruential
random number generators have this flaw.  We will have more to say
about this in section 7 (The Spectral Test).



6.1.3  THE 3-D TEST

The 3-D test is the logical extension of the 2-D test and the 1-D test.
Imagine a unit cube stretching from (0,0,0) at one tip to (1,1,1) at
the other tip.  We will use our random number generator to plot points
on this unit cube.  To plot a point we generate three random numbers
and use them as the coordinates.  We divide the unit cube into a number
of smaller cubes or "bins", and then see how randomly the random number
generator fills our bins, the same as we do for the 2-D test and 1-D
test.  The value given for the 3-D test is the approximate number of
bins per side above which the random number generator fails the
Chi-Square test in three dimensions.

If we were to look at the unit cube after we had plotted a large
number of points, we would notice that rather than appearing random,
the points all lined up on parallel planes.  All linear congruential
random number generators have this flaw.  The infamous RANDU
generator has only 16 parallel planes covering the unit cube, hence it
performs extremely poorly on the 3-D test.  We will have more to say
about this in section 7 (The Spectral Test).


6.1.4  THE 4-D TEST

The 4-D test and higher dimensional tests are again the logical
extension of the 3-D test, 2-D test, and 1-D test.  For an n-D or
n-dimensional test, we imagine an n-dimensional hypercube, divided in a
gridlike manner into smaller hypercubes or "bins".  We plot points in
this n-dimensional hypercube by generating n random numbers and using
them as the coordinates of a point.  We then perform a chi-square test
to see how evenly or unevenly our bins get filled with points.  The
value given for any n-D test is the approximate number of bins per
dimension (bpd) above which the random number generator failed the
chi-square test.  (The total number of bins in an N-dimensional
hypercube would be the number of bins per dimension raised to the Nth
power.)



6.2  TEST RESULTS

Each generator was tested for dimensions 1 through 8.  Here now are the
results:



6.2.1  MTH$RANDOM

          DEFINITION:

             SEED = (69069*SEED + 1) mod 2**32
             X = SEED/2**32
             returns real in range [0,1)

          RATING:
             Fails 1-D above 350,000 bpd     (bins per dimension)
             Fails 2-D above 600 bpd
             Fails 3-D above 100 bpd
             Fails 4-D above 27 bpd
             Fails 5-D above 15 bpd
             Fails 6-D above 9 bpd
             Fails 7-D above 7 bpd
             Fails 8-D above 4 bpd

          Comments:
              This generator is also used by the VAX FORTRAN intrinsic
              function RAN, and by the VAX BASIC function RND.



6.2.2  RANDU

          DEFINITION:
             SEED = (65539*SEED) mod 2**31
             X = SEED/2**31
             returns real in range [0,1)

          RATING:
             Fails 1-D above 200,000 bpd
             Fails 2-D above 400 bpd
             Fails 3-D above 3 bpd
             Fails 4-D above 6 bpd
             Fails 5-D above 6 bpd
             Fails 6-D above 4 bpd
             Fails 7-D above 3 bpd
             Fails 8-D above 3 bpd

          Comments:
              Note the extremely poor performance for dimensions 3 and
              above.  This generator is obsolete.



6.2.3  ANSI C ( rand() )

          DEFINITION:

             SEED = (1103515245*SEED + 12345) mod 2**31
             X = SEED
             returns integer in range [0, 2**31)

          RATING:
             Fails 1-D above 500,000 bpd
             Fails 2-D above 600 bpd
             Fails 3-D above 80 bpd
             Fails 4-D above 21 bpd
             Fails 5-D above 15 bpd
             Fails 6-D above 8 bpd
             Fails 7-D above 7 bpd
             Fails 8-D above 5 bpd



6.2.4  Microsoft C v4.0 rand()

          DEFINITION:

             SEED = (214013*SEED + 2531011) mod 2**31
             X = int( SEED/2**16 )
             returns integer in range [0, 2**15)

             Fails 1-D above 3000 bpd
             Fails 2-D above 700 bpd
             Fails 3-D above 84 bpd
             Fails 4-D above 16 bpd
             Fails 5-D above 15 bpd
             Fails 6-D above 7 bpd
             Fails 7-D above 6 bpd
             Fails 8-D above 4 bpd

          Comments:
              The poor performance for 1-d is partly due to truncating
              the lower 16 bits of SEED before returning X.  X is
              returned as a 15 bit positive signed integer.



6.2.5  Turbo Pascal v6.0 (random)

          DEFINITION:

             SEED = (134775813*SEED + 1) mod 2**32
             X = int(SEED/2**16)
             returns integer in range [0, 2**16)

             Fails 1-D above 6000 bpd
             Fails 2-D above 700 bpd
             Fails 3-D above 87 bpd
             Fails 4-D above 27 bpd
             Fails 5-D above 13 bpd
             Fails 6-D above 9 bpd
             Fails 7-D above 7 bpd
             Fails 8-D above 4 bpd

          Comments:
              The poor performance for 1-d is partly due to truncating
              the lower 16 bits of SEED before returning X.  X is
              returned as a 16 bit unsigned integer.

----------------------------------------------------------------------------



7.0  THE SPECTRAL TEST

As stated earlier, for a linear congruential random number generator,
a 2-D plot will show that all the points lie on parallel lines, and a
3-D plot will show that all the points lie on parallel planes.  In
general, an n-D plot would show that all the points lie on (n-1)
dimensional hyperplanes.

One test of a linear congruential random number generator is to see how
far apart these parallel planes are for each dimension.  The spectral
test calculates the distances between these parallel planes for a
linear congruential random number generator given the modulus M and the
multiplier A of the generator.  A good random number generator would
need to have the planes closely spaced for each dimension so that the
individual planes do not become apparent early on.  RANDU, having only
16 widely spaced parallel planes in the 3-D unit cube, fails the
spectral test quite horribly.

We will not discuss further the spectral test here.  There is a working
program SPECTRAL.FOR which you can play with.  Further information on
the spectral test can be found in the book by Knuth.

---------------------------------------------------------------------------



8.0  SHUFFLING

A simple way to greatly improve any random number generator is to
shuffle the output.  Start with an array of dimension around 100.  (The
exact size of array is not important.)  This is the array that will be
used to shuffle the output from the random number generator.
Initialize the array by filling it with random numbers from your random
number generator.  Then when the program wants a random number,
randomly choose one from the array and output it to the program.  Then
replace the number chosen in the array with a new random number fromthe random number generator.

Note that this shuffling method uses two numbers from the random number
generator for each random number output to the calling program.  The
first random number is used to pick an element from the array of random
numbers, and the second random number is used to replace the number
chosen.

The improvement in randomness is quite dramatic.  We tried this
shuffling technique with the ANSI C random number generator.  The
results are given below first without shuffling and then with shuffling
using an array size of 128 (a convenient power of 2):


    ANSI C rand()

         WITHOUT SHUFFLING:         WITH SHUFFLING:
    1-D  Fails above 500,000 bpd    Fails above 400,000 bpd
    2-D  Fails above 600 bpd        Fails above 3100 bpd
    3-D  Fails above 80 bpd         Fails above 210 bpd
    4-D  Fails above 21 bpd         Fails above 55 bpd
    5-D  Fails above 15 bpd         Fails above 24 bpd
    6-D  Fails above 8 bpd          Fails above 14 bpd
    7-D  Fails above 7 bpd          Fails above 9 bpd
    8-D  Fails above 5 bpd          Fails above 7 bpd


Notice the dramatic improvement in performance when shuffling is
used.  This shuffling technique works so well it may even fix up the
defective RANDU generator to the point where it is acceptable.  (Only a
lack of time prevented this test from being performed.)  This shuffling
technique is highly recommended for anyone who is even remotely
concerned about the quality of their random number generator.

Shuffling also takes care of the problem of points lying "mainly in the
planes" as was discussed earlier.  Points plotted on a unit square
appear random rather than lying on parallel lines, and points plotted
on a unit cube appear random rather than lying on parallel planes.  In
general points plotted on a unit n-dimensional hypercube no longer all
lie on parallel (n-1) dimensional hypercubes.

This shuffling technique is highly recommended.

---------------------------------------------------------------------------



9.0  DES AS A RANDOM NUMBER GENERATOR

The Data Encryption Standard (DES) is described by the reference "Data
Encryption Standard", 1977 January 15, Federal Information Processing
Standards Publication, number 46 (Washington: U.S.  Department of
Commerce, National Bureau of Standards).  This is the American National
Standard Data Encryption Algorithm X3.92-1981.

This algorithm was always intended to be implemented in hardware
because it is so complicated a software implementation would be too
slow for any practical use.  Nevertheless software implementations
exist and a VAX MACRO implementation is provided in file DES.MAR

DES can be set up as a random number generator by feeding the output
back into the input the same as with a liner congruential generator.
The DES algorithm returns a 64 bit quadword.  For the tests below the
low 32 bits was taken as one random unsigned longword and the high 32
bits was taken as another random unsigned longword.  Thus we got two
random numbers for each iteration.  Still it was incredibly slow.

Here are the results of testing DES as a 32 bit random number generator:

     1-D PASS 1,000,000 bpd, fails above
     2-D PASS      1100 bpd, fails above
     3-D PASS        90+bpd, highest tested due to slowness of DES
     4-D PASS        33+bpd, highest tested due to slowness of DES
     5-D PASS        16+bpd, highest tested due to slowness of DES
     6-D PASS         9+bpd, highest tested due to slowness of DES
     7-D PASS         7+bpd, highest tested due to slowness of DES
     8-D PASS         4 bpd, fails above

Note that DES is extremely slow.  DES takes an hour to generate the
same number of random variables a shuffled linear congruential
generator can generate in about a minute an a half.  And note a
shuffled linear congruential generator does better than DES.
---------------------------------------------------------------------------



10.0  HOW GOOD A RANDOM GENERATOR DO YOU NEED?

The exact determination of whether a particular random number generator
will be adequate for your needs is highly dependent upon your
application.  There is a trade off between resolution and quantity.  At
high resolution, it doesn't take very many random numbers before some
non-randomness can be detected.  At low resolution, it takes quite a
large number of random numbers before some non-randomness can be
detected.

The chi-square tests above give us an idea of where the tradeoff lies.
Each test used an expected value of 10 balls per bin, and each ball
required DIM random numbers per ball, where DIM is the dimension of the
chi-square test being performed.  The number of bins per dimension
determined the resolution required of the random numbers.

For example, the MTH$RANDOM number generator generates floating point
deviates between 0 and 1.  The floating point number returned is an
F_floating datum type with 24 bits of accuracy.  If we are doing a 1-D
chi-square test with 128 bins, we would convert the floating point
random number into an integer between 0 and 127 and then add 1.  This
represents a loss of 17 bits of resolution, since we essentially
truncate the low 17 bits of the F_floating datum and keep the upper 7
bits which represents a number between 0 and 127.  Because of this loss
of resolution, we can generate a lot more random numbers before any
non-randomness can be detected.

The chi-square tests above on the linear congruential generators
without shuffling were able to detect non-randomness in the generators
after about one million values were generated.  This value is somewhat
approximate, but it gives you an idea of the order of magnitude above
which you cannot expect the sequence to appear random.  If you only
need a small number of random numbers, say less than 100,000, if you do
not demand high resolution of the numbers, and if you're not concerned
about correlations between the initial SEED value and the first few
numbers from the generator, or about occasional correlations between
successive runs of the generator, or about the numbers all lying in
parallel lines and parallel planes, then the simple MTH$RANDOM or ANSI
C random number generators may be adequate for your needs.  Despite
it's numerous drawbacks, this simple method of random number generation
is quite adequate for many applications.

If you are concerned about the quality of your random numbers, or if
you need more than 100,000 random numbers, or if you want to make sure
there is no easy relation between the initial SEED value and the first
few numbers output from the generator, or if you want to make sure
there will be no relation between successive runs of the generator,
then you can shuffle the numbers using the technique in section 8
(Shuffling).  This produces excellent results, the amount of memory
required is quite small, and the small extra amount of time required is
usually quite trivial.

By shuffling the numbers you may be able to get a million or more
satisfactory random numbers from the MTH$RANDOM or ANSI C random number
generators.  If significantly more than a million random numbers are
needed, then a closer look at your application is warranted to
determine if the random number generator will be adequate for the
purpose.  But don't despair.  Just because 50 million shuffled random
numbers from the ANSI C generator would fail all of the chi-square
tests doesn't mean they won't be quite adequate for you application.

---------------------------------------------------------------------------


11.0  WHAT IS A RANDOM NUMBER GENERATOR?

The field of Information Theory is where we find a mathematical
understanding of what a random number generator really is.  Our ideal
random number generator is a zero-memory information source with
maximum entropy.  We think of a source as emitting a sequence of
symbols from a fixed finite set S = {s1, s2, ... sn}.  Successive
symbols are selected according to some fixed probability law.  For a
zero-memory source, the source remembers nothing about what previous
symbols have been emitted; each successive symbol emitted from the
source is completely independent of all previous output symbols.
Maximum entropy indicates that each symbol in the set S is
equally likely to occur.  Any other probability distribution would
result in a lower entropy, with some symbols being more likely than
others.  You may recall entropy is also used in thermodynamics to
measure the amount or "randomness" in a particular system.

A linear congruential random number generator is an example of a
first-order Markov process.  The next value depends entirely on
the previous value.  Even the Data Encryption Standard (DES) is nothing
more than a first-order Markov process.  A first-order Markov source
can emulate a zero-memory source up to a point.

A higher order Markov process has a better chance of emulating a
zero-memory process.  By shuffling the output, we have converted the
first-order Markov process to something like an Mth-order Markov
process where M is the size of the shuffling deck we use.  In this
case M pieces of information about previous outputs are kept instead of
just 1 piece of information.  Thus, because of the additional memory
used, the output can resemble more closely that of a zero-memory
source.  And indeed it does do a much better job.

For further understanding of zero-memory information sources and Markov
information sources consult a good book on Information Theory.  The
book by Abramson in the references below is an excellent one which also
gives an interesting account of how languages such as English, Spanish,
French, and German, can be synthesized using a low order Markov
process.

---------------------------------------------------------------------------



12.0  CONCLUSIONS

In this paper we have examined some probability theory, the concept of
a random sequence coming from a random process, and ways of analyzing a
sequence of numbers to determine how likely it is they came from a
random process.

We then examined the inside of a linear congruential random number
generator which is the most prevalent type of random number generator
in use today.  We dissected 5 commonly found linear congruential
generators, and then analyzed them using the chi-square and KS tests.
We found it was possible to detect non-randomness in each of these
generators after a few million iterations, except for the defective
RANDU generator, which failed the 3-D test after less than 800
iterations.

A technique for shuffling the random numbers was given and tested with
excellent results.  Shuffling the output of any random number generator
using the technique given is therefore highly recommended.

Analysis of the Data Encryption Standard (DES) as a random number
generator was also performed.  A software implementation of the Data
Encryption Standard DES was found to be too slow for any practical use
and the randomness of the results was found to be less than that
obtained by shuffling the output of a good linear congruential
generator.

Finally, a brief mathematical explanation of what an ideal random
number generator really is and how it differs from a computerized
random number generator was given.  For further reading on this topic,
consult a book on Information theory and read about the different types
of information sources defined there.

===========================================================================

REFERENCES:


NUMERICAL RECIPES: The Art of Scientific Computing

      William H. Press,
         Harvard-Smithsonian Center for Astrophysics
      Brian P. Flannery,
         EXXON Research and Engineering Company
      Saul A. Teukolsky,
         Department of Physics, Cornell University
      William T. Vetterling,
         Polaroid Corporation
      Cambridge University Press, 1986

         This book contains a wealth of knowledge along with source code
         examples on how to solve various kinds of programming problems.
         There are several programming language versions of this book
         available, including FORTRAN, Pascal, C, and BASIC.  Machine
         readable forms in various languages of all the sample routines
         are also available.



THE ART OF COMPUTER PROGRAMMING:  Vol 2. Seminumerical Algorithms

      Knuth, Donald E.
         Stanford University.
      2nd edition.  Reading, Mass.:  Addison-Wesley 1981.

      Knuth's 3 volume set on computer programming is the comprehensive
      source which all others turn to.



INFORMATION THEORY AND CODING

      Abramson, Norman
         Associate Professor of Electrical Engineering, Stanford University
      McGraw-Hill Book Company, Inc. 1963 McGraw-Hill electronic
      Sciences Series

      An excellent book on Information Theory.  Describes a zero memory
      source and an Nth order Markov process.  Also shows how language
      can be synthesized by an Nth order Markov process.  For example,
      you could synthesize something which sounded French, and if you
      didn't know French, you'd say it was French.  The same is true
      for random numbers; you can synthesize them.



THE C PROGRAMMING LANGUAGE

      Kernighan, Brian W.
      Ritchie, Dennis M.
      Englewood Cliffs, N.J. : Prentice Hall, c1978.
      2nd edition.  Englewood Cliffs, N.J. : Prentice Hall, c1988.

      Considered the book which defines the C programming language.



"A Revised Algorithm for the Spectral Test"  Algorithm AS 193.

      Hopkins, T.R.
         University of Kent at Canterbury, UK.
      'Applied Statistics' Vol 32, pg 328-335 1983

      This article gives a version of the spectral test coded in FORTRAN.
      The program modified for VAX/VMS is presented in SPECTRAL.FOR



THE GENERATION OF RANDOM VARIATES

      Newman, Thomas G.
         associate professor of mathematics Texas Tech University
      Odell, Patrick L.
         professor of mathematics and statistics Texas Tech University
      New York, Hafnew Publishing Company 1971
      Series Title: Griffin's statistical monographs & courses ; no. 29.



DISTRIBUTION SAMPLING FOR COMPUTER SIMULATION

      Lewis, Theodore Gyle
         University of Southwestern Louisiana
      Lexington, Massachusetts, D.C. Heath and Company 1975



ADVANCED ENGINEERING MATHEMATICS

      Erwin Kreyszig,
         Professor of Mathematics, Ohio State University, Columbus Ohio.
      5th edition.  John Wiley & Sons, 1983.
      Chapter 23: Probability and Statistics



PROBABILITY AND STOCHASTIC PROCESSES FOR ENGINEERS

      Carl W. Helstrom
         Department of Electrical Engineering and Computer Sciences,
         University of California, San Diego.
      New York, Macmillan Publishing Company, 1984

      Advanced probability theory.  College or graduate level.



PROBABILITY AND STATISTICS

      Edwards, Allen L.
         Professor of Psychology, University of Washington
      New York.  Holt, Rinehart and Winston, Inc.  1971

      A first course in probability and statistics, applied statistics
      and experimental design.



DETECTION OF SIGNALS IN NOISE

      Whalen, Anthony D.
         Bell Telephone Laboratories
      New York, Academic Press, Inc. 1971

      A graduate level book.  Chapter 1 is a review of probability and
      chapter 2 describes random processes.



NOISE AND ITS EFFECT ON COMMUNICATION
   Blachman, Nelson M.
      Sylvania Electronic Systems, Mountain View, California
   McGraw-Hill Book Company  1966
   McGraw-Hill Electronic Sciences Series



ENCYCLOPEDIA OF COMPUTER SCIENCE
   edited by Anthony Ralston
   Petrocelli (New York, 1976)
   See:
     Random Number Generation (pp. 1192-1197)
      by G. Marsaglia