[erlang-patches] Fix a bug in the implementation of the pseudo-random number generator

Wed Oct 6 04:32:49 CEST 2010

[ sorry for the very long mail ]

On Wed, 6 Oct 2010 04:19:11 +0200
Cristian Greco <cristian@REDACTED> wrote:

> Hi,
> 
> the current implementation of the Wichmann-Hill prng contains an
> error, which is fixed in the following branch.
> 
>   git fetch git@REDACTED:cristiangreco/otp.git cg/fix-prng
> 
> More details provided in the following mail.
> 
> 
> This commit fixes an error in the mathematical formula of the
> Wichmann-Hill pseudo-random number generator. In particular, the
> implementation used until now produces sequences which differ from the
> expected ones by an extra starting number, which is instead the very
> last value of the sequence. This bug amplified the effect of extremely
> correlated initial numbers when seeding different generators with very
> similar seed values.

The first number emitted from the functions in module 'random' (either
via uniform/0, uniform/1, uniform_s/1 or uniform_s/2) is not the first
number of the real sequence generated by the Wichmann-Hill prng.

This kind of generator is a combination of three simple ones (LCGs):

X_i = (171 * X_i-1) mod 30269
Y_i = (172 * Y_i-1) mod 30307
Z_i = (170 * Z_i-1) mod 30323

with

U_i = (X_i / 30269 + Y_i / 30307 + Z_i / 30323) mod 1

while the actual implementation in random.erl is something like:

U_i = (X_i-1 / 30269 + Y_i-1 / 30307 + Z_i-1 / 30323) mod 1

Please compare the following outputs: the first _6_ float numbers
emitted in Erlang with the default seed of {3172, 9814, 20125},

1> [ random:uniform() || _ <- lists:seq(1,6) ].  
[0.09230089279334841,0.4435846174457203,0.7230402056221108,
 0.94581636451987,0.5014907142064751,0.311326754804393]

with the first _5_ floats in Python,

$ python -c "import random; r = random.WichmannHill(); r.setstate((1,
(3172, 9814, 20125), None)); print [ r.random() for n in range(0,5) ]"
[0.44358461744572031, 0.7230402056221108, 0.94581636451986995,
0.50149071420647506, 0.31132675480439298]

or in R,

> .Random.seed <- c(400L, 3172L, 9814L, 20125L)
> runif(5)  
[1] 0.4435846 0.7230402 0.9458164 0.5014907 0.3113268

where the first element of the vector in .Random.seed is the code of
the WH prng. It is clear that the sequence of values produced by the
implementation in Erlang is somehow shifted compared to that produced
by other implementation. We get the same results sampling random
integers in (1, 1000000) from the generator:

Erlang:
1> [ random:uniform(1000000) || _ <- lists:seq(1,6) ].  
[92301,443585,723041,945817,501491,311327]

Python:
$ python -c "import random; r = random.WichmannHill(); r.setstate((1,
(3172, 9814, 20125), None)); print [ r.randint(1,1000000) for n in
range(0,5) ]"
[443585, 723041, 945817, 501491, 311327]

R:
> .Random.seed <- c(400L, 3172L, 9814L, 20125L)
> sample(1:1000000, 5, replace=T)  
[1] 443585 723041 945817 501491 311327

To be more precise, given the math involved in the Wichmann-Hill
generator and the kind of error of the implementation, the first number
emitted in Erlang is not completely unrelated to the real sequence
generated by the seed {3172, 9814, 20125}. Indeed it is the _very last_
number of the sequence, as demonstrated by:

$ python -c "import random; r = random.WichmannHill(); r.setstate((1,
(3172, 9814, 20125), None)); r.jumpahead(6953607871643); print
r.random()"
0.0923008927933

where the jumpahead(N) function from the Python library changes the
internal state of the generator by "jumping" N states away from the
current one, while the number passed as argument is `P - 1`, where P =
6953607871644 ~ 6.95e+12 is the period of this generator. For the sake
of completeness, the following command prints the state of the
generator before and after emitting the last number of the sequence,
and then loops again:

$ python -c "import random; r = random.WichmannHill(); r.setstate((1,
(3172, 9814, 20125), None)); r.jumpahead(6953607871643); print
r.getstate(); print r.random(); print r.getstate(); print
r.random()"
(1, (21968, 17325, 20631), None)
0.0923008927933
(1, (3172, 9814, 20125), None)
0.443584617446

and it is worth noting that:

1> F = fun({A1,A2,A3}) -> {(A1*171) rem 30269, (A2*172) rem 30307,  
(A3*170) rem 30323} end.
#Fun<erl_eval.6.13229925>
2> F({21968, 17325, 20631}).  
{3172,9814,20125}

I ran some tests using the dieharder tool and it _seems_ this bug does
not considerably affects the randomness of the generated sequence; in
any case, I'm not a theorist so can't judge any possible weakness of
this implementation.

On the other hand, this error clearly explains a common misconception
about the pseudo-random number generator in Erlang. Initializing a
number of parallel generators (here I mean concurrent processes, not
independent parallel streams) using very similar seeds from subsequent
calls to erlang:now/0 produces sequences whose first numbers are
extremely correlated (e.g. slightly increasing values, or even
identical values for sufficiently small values of N in
random:uniform/1) but then seem to "diverge pretty rapidly". The
problem here is that the first number within each sequence is not the
first one emitted by the prng, it is instead just like a "poor hash" of
the starting seeds.

This behavior seems to be well known and has been discussed on the
lists:

http://www.erlang.org/cgi-bin/ezmlm-cgi?4:53165:201009
http://www.erlang.org/cgi-bin/ezmlm-cgi?4:39751:200811
http://www.erlang.org/cgi-bin/ezmlm-cgi?4:39752:200811
http://www.erlang.org/cgi-bin/ezmlm-cgi?4:39756:200811

This is an example using a fixed module called 'fixrand':

1> S = lists:map(fun() -> {A,B,C} = erlang:now(), random:seed(A,B,C),  
io:format("~p~n", [[random:uniform(10) || _ <- lists:seq(1,10)]]),
{A,B,C} end, lists:seq(1,5)).
[4,1,2,2,10,9,1,7,4,6]
[5,7,2,4,10,9,3,9,8,2]
[5,4,2,2,9,10,9,8,10,10]
[5,1,5,9,9,2,6,6,6,10]
[5,6,4,9,4,8,10,9,4,10]
[{1286,310211,426712},
 {1286,310211,428248},
 {1286,310211,428375},
 {1286,310211,428497},
 {1286,310211,428585}]
2> lists:map(fun({A,B,C}) -> fixrand:seed(A,B,C), io:format("~p~n",  
[[fixrand:uniform(10) || _ <- lists:seq(1,10)]]) end, S).
[1,2,2,10,9,1,7,4,6,9]
[7,2,4,10,9,3,9,8,2,4]
[4,2,2,9,10,9,8,10,10,8]
[1,5,9,9,2,6,6,6,10,5]
[6,4,9,4,8,10,9,4,10,10]
[ok,ok,ok,ok,ok]

As a final remark, please note that a fixed version of the Wichmann-Hill
generator, although still considered acceptable nowadays, indeed shows
the same flaw when seeding with very very similar values and sampling
from small integer ranges, for example:

1> S = [ erlang:now() || _ <- lists:seq(1,5) ].  
[{1286,310474,823741},
 {1286,310474,823744},
 {1286,310474,823746},
 {1286,310474,823748},
 {1286,310474,823750}]
2> lists:map(fun({A,B,C}) -> fixrand:seed(A,B,C), io:format("~p~n",  
[[fixrand:uniform(10) || _ <- lists:seq(1,10)]]) end, S).
[5,2,10,8,7,6,9,6,1,7]
[5,10,10,1,8,5,4,7,2,6]
[5,9,1,7,5,5,10,2,9,5]
[5,8,1,2,3,4,6,6,6,4]
[5,7,2,8,10,4,2,10,3,3]
[ok,ok,ok,ok,ok]
3> lists:map(fun({A,B,C}) -> fixrand:seed(A,B,C), io:format("~p~n",  
[[fixrand:uniform(10000) || _ <- lists:seq(1,10)]]) end, S).
[4284,1169,9327,7355,6266,5247,8976,5920,525,6819]
[4452,9761,9994,714,7380,4684,3276,6982,1104,5266]
[4564,8823,439,6287,4789,4309,9477,1024,8157,4230]
[4676,7884,883,1860,2199,3934,5677,5065,5209,3195]
[4788,6946,1328,7433,9609,3558,1877,9107,2262,2159]
[ok,ok,ok,ok,ok]

or when using seeds with congruent numbers, such as in the following
example, where 510934 = 450288 mod 30323

http://www.erlang.org/cgi-bin/ezmlm-cgi?4:39986:200811

This bug seems to be around since at least 11 years, which is the
release year of the oldest available official release (R6B-0, released
1999) at http://erlang.org/download.

Thanks,
--
Cristian Greco
GPG key ID: 0xCF4D32E4
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-patches/attachments/20101006/8c3ffbc3/attachment.bin>