by Neal R. Wagner
Copyright © 2001 by Neal R. Wagner. All rights reserved.
NOTE: This site is obsolete. See:
The iterative equation
f(x) = 4x(1 - x), 0 <= x <= 1,
known as the logistic equation, is historically interesting as one of the earliest proposed sources of pseudo-random numbers. Ulam and von Neumann suggested its use in 1947, partly because it had a ``known algebraic distribution''. This means that even though the sequence of numbers generated by repeatedly applying the function to itself is not uniformly distributed, an algebraic transformation will give the uniform distribution. The equation was mentioned again in 1949 by von Neumann, and much later in 1969 by Knuth, but it was never used for random number generation.
Another function closely related to the logistic equation is the ``tent'' function defined by
g(u) = 1 - 2*abs(u - 1/2), 0 <= u <= 1.
This function does directly yield numbers that are uniformly distributed, and f can be transformed to g using the equation:
T(u) = sin2((Pi/2)*u).
The inverse transformation is:
T-1(v) = (2/Pi)*arcsin(sqrt(v)).
Thus the sequence
{T-1(fn(x))}, for n = 0, 1, 2, 3, ...will be uniformly distributed for ``almost all'' starting values x, where f2(x) means f(f(x)), f3(x) means f(f(f(x))), and so forth.
The above equations work to give uniformly distritrbuted sequences of real numbers if one could use what is called ``infinite precision'' real numbers, that is, mathematical reals. Even in this case, the sequence {T-1(fn(x))} does not at all behave like a true random sequence. A cycle occurs in the sequence {fn(x)} in case it repeats after finitely many iterations. There are infinitely many finite cycles like this, even though ``almost all'' real numbers will not belong to such a cycle. For example, 3/4 is transformed to itself by f. Nevertheless, if one starts with an x value very close to 3/4, successive values will also be close to 3/4 (though each new value not as close), so the sequence is definitely not always random-looking. One can get more random-looking numbers by interating the sequence and letting the high-order bits fill up with noise.
Using actual computer floating point hardware that implements the IEEE float or double type (32 bits, about 7 digits of precision, or 64 bits, about 16 digits of precision), the behavior of these functions is quite different from the behavior in infinite precision. The tent function converges almost immediately to zero, since each iteration add another low-order 0 bit. The actual logistic equation acts a little better, but still has a relatively short initial run followed by a realtively short cycle. For a float one can try out all possible starting values to see the cycle lengths that occur and their percent occurrence: (1, 93%), (930, 5.6%), (432, 1%), (106, .35%), (205, 0.1%), (5, 0.002%), (4, 0.0004%), (3, 0.00005%), and (1, 0.00002%). Notice that after an initial run of a few thousand values, the function falls into the cycle of length 1 (the cycle that maps 0 to itself) 93% of the time. For a double, I could only try random starting points and record the percentages, but after a run about 10 000 000 long, the function falls into the same cycle of length 1 about 15% of the time. For other initial values the cycles are of lengths from a few million to 5, 10 or 15 million values.
At this point, the logistic equation would seem useless for random number generation, since it has non-random behavior and often falls into a stationary point. However, I came up with two ideas to make it useful.
The logistic equation yields numbers very close to 0 (on the positive side) and very close to 1. Available floating point numbers ``pile up'' near 0, but there is no similar behavior near 1. I was able to restructure the equation so that values occurring near 1 are re-mapped to the negative side of 0. The following definition does this, mapping [-1, 1] to itself (still called f because is it essentially the same):
f(x) = 2*abs(x)*(2 - abs(x)), for abs(x) <= beta, f(x) = -2*(1 - abs(x))2, for beta <= abs(x) <= 1
Here beta = 1 - 1/sqrt(2) = 0.292893218813452476. In infinite precision, this re-mapped equation behaves exactly like the original, but with floating point numbers there is no longer any convergence to the cycle of length 1. Besides this cycle, the other cycles have lengths about 5 or 10 times as long. A slightly different function will again transform these numbers to the uniform distribution:
S(x) = (2/Pi)*arcsin(sqrt(x/2)), for 0 <= x < 1, S(x) = (2/Pi)*arcsin(sqrt(-x/2)) + 0.5, for -1 <= x < 0.
This equation does much better, but it is still not remotely good enough. The final piece of the puzzle is to combine multiple logistic equations into a lattice and couple them weakly together.
Researchers in chaos theory often use a lattice of numbers to which the logistic equations are applied. Adjacent lattice nodes affect one another to a small degree. This model is a greatly simplified version of a fluid undergoing turbulence. In the 1-dimensional case, the nodes are an array of m numbers:
x[0], x[1], x[2], . . . , x[m-1]
The equation applied to each node i is:
xnew[i] = (1 - 2*v)*f(x[i]) + v*(f(x[i-1]) + f(x[i+1])),
where xnew[i] is the new value at each node, v is a special viscosity constant, and the calculation of i-1 and i+1 is done modulo m, that is, by dividing by m and taking the remainder. In effect, this wraps the linear array of nodes into a circle.
In two dimensions, the equations take the form:
xnew[i][j] = (1 - 4*v)*f(x[i][j]) +
v*(f(x[i][j-1]) + f(x[i][j+1]) +
f(x[i-1][j]) + f(x[i+1][j])),
Here arithmetic with both subscripts is carried out modulo m. In both the 1- and 2-dimensional versions above, the numbers should be doubles, and the constant v should be small, so as not to disturb the uniform distribution and to promote more turbulence. I have used 10-13 for this constant. It is necessary to iterate the equations before outputting a value; I have used 120 iterations. The sizes should be at least size 7 in one dimension and size 3-by-3 in two dimensions.
If the initial values are symmetric about some axis, then the lattice will repeat as if there were just single logistic equations, so it would be best to use another random number generator to initialize the lattice values. The individual nodes should be independent of one another, so that this will produce m or m2 random reals at a time, depending on whether it is 1- or 2-dimentional.
If one is worried about a symmetric set of values coming up (a highly improbable occurrence), one could use a variation of the equations that is not symmetric, such as:
xnew[i][j] = (1 - 4*v)*f(x[i][j]) +
v*(1.1*f(x[i][j-1]) + 0.9*f(x[i][j+1]) +
1.2*f(x[i-1][j]) + 0.8*f(x[i+1][j])),
Here is a view of the 2-dimensional lattice of size 3-by-3, showing the actual nodes in black italic, and adjacent nodes in red when they are calculated using modular arithmetic. There are only 9 nodes, but the others appear to be adjacent. As another more mathematical way of visualizing this lattice, if the left and right sides are pasted together, the lattice would form a vertical hollow cylinder. Then if the top and bottom sides are pasted together, it would form a donut-shaped object (called a torus by mathematicians). (The picture is similar to the old ``Pac Man'' games, where the Pac Man would exit on one side and immediately come in from the other side.)
| Logistic Lattice | ||||||
|---|---|---|---|---|---|---|
| t[1][1] |
t[1][2] |
t[1][0] |
t[1][1] |
t[1][2] |
t[1][0] |
t[1][1] |
| t[2][1] |
t[2][2] |
t[2][0] |
t[2][1] |
t[2][2] |
t[2][0] |
t[2][1] |
| t[0][1] |
t[0][2] |
t[0][0] | t[0][1] | t[0][2] | t[0][0] |
t[0][1] |
| t[1][1] |
t[1][2] |
t[1][0] | t[1][1] | t[1][2] | t[1][0] |
t[1][1] |
| t[2][1] |
t[2][2] |
t[2][0] | t[2][1] | t[2][2] | t[2][0] |
t[2][1] |
| t[0][1] |
t[0][2] |
t[0][0] |
t[0][1] |
t[0][2] |
t[0][0] |
t[0][1] |
| t[1][1] |
t[1][2] |
t[1][0] |
t[1][1] |
t[1][2] |
t[1][0] |
t[1][1] |
The pseudo-random number generator based on this lattice seems to be a very good source of random numbers, but from the nature of this theory, it is not possible to prove results about the probable length of cycles or about the quality of its random behavior. It seems likely that for almost all starting values (nine doubles), the generator will not cycle for a very long time. It has been tested for random initial values and did not cycle for billions of iterations. The numbers produced gave good results when subjected to statistical tests. Nevertheless, the ``perfect'' generators of the previous section are to be preferred to this one.
The Java implementation did not cause any particular problems. Here is Java code giving the 3-by-3 lattice. Logistic Lattice in Java.