[erlang-bugs] Unstated limits of random:uniform/1
Michael Truog
mjtruog@REDACTED
Fri Feb 28 07:25:10 CET 2014
Python code comments claim to have 27814431486575 as the max value to be expected from the Wichmann-Hill 1982 generator (http://svn.python.org/projects/python/trunk/Lib/random.py). They also make the same mistake with floating point numbers and they do not enforce the max random number, so they have the same problem. However, based on the algorithm and anywhere else its limitation is mentioned, the max value is 27817185604309 as mentioned previously in the email thread. The random numbers are low-quality (repeats every 2.78e13). However, they are using less than 27 bits for some of the operations, so they are at least using real integers underneath for some of the operations in the random module Erlang code. So, for quick Erlang random numbers where a 2.78e13 period is sufficient, using the random module is best (avoiding the crypto port driver), assuming you only need 44 bits or less.
When I last dug into crypto its period is different based on how OpenSSL is configured, and it is difficult to determine based on the source code. So, my numbers are probably wrong, but what I have is: "crypto:strong_rand_bytes/1 repeats in the same way as RAND_bytes within OpenSSL. crypto:rand_bytes/1 repeats in the same way as RAND_pseudo_bytes within OpenSSL. if OpenSSL is configured to use the MD PRNG (default) with SHA1 (in openssl/crypto/rand/md_rand.c), the collisions are between 2^80 and 2^51 (http://eprint.iacr.org/2008/469.pdf). So, that means "weak" would repeat ideally every 1.21e24 and at worst every 2.25e15. if OpenSSL was compiled in FIPS mode, it uses ANSI X9.31 RNG and would have collisions based on 3DES.". OpenSSL should have more generation options now (I looked into this in July of 2012). The worst (period) performance of SHA1 made me learn to like the random module for things that don't deserve the use of crypto/OpenSSL, due to the port driver usage
(2.78e13 is relatively close to 2.25e15). I would love it if someone could find a reference for how dependable the OpenSSL random numbers really are, since there should be many.
I wish we could have a guard enforce the limit in the random module to avoid its abuse. However, other programming languages have made this mistake with the 1982 generator (at least Python). We also seem to need crypto to give us a floating point number, with code similar to what is below:
strong_float() ->
% 53 bits maximum for double precision floating point representation
Bytes = 7, % erlang:round(53.0 / 8), % bytes for random number
MaxRand = 72057594037927940, % math:pow(2, 7 * 8) - 1, % max random number
binary:decode_unsigned(crypto:strong_rand_bytes(Bytes)) / MaxRand.
part of the reason I created https://github.com/okeuday/quickrand is to avoid these problems
On 02/27/2014 09:14 PM, Kenji Rikitake wrote:
> random:uniform/1 is defined as (in lib/stdlib/src/random.erl)
>
> uniform(N) when is_integer(N), N >= 1 -> trunc(uniform() * N) + 1.
>
> random:uniform/0 is within [0, 1), and is an IEEE 754 double precision
> float, which only has 53 significand bits. So giving anything larger
> than 2^53 to N will give no additional meaningful resolution other
> than the 53 MSBs of the result.
>
> And as in
>
> ++> Michael Truog <mjtruog@REDACTED> [2014-02-11 07:28:22 -0800]:
>> 27817185604309 == 30269 * 30307 * 30323 is the highest N should ever be, otherwise the result is garbage.
> the output resolution is slightly less than 2^45 (27817185604309 = ~
> 2^(44.66)).
>
> I wonder, however, how many of the bits in random:uniform/0 really has
> the meaningful randomness, since it is a simple multiplying
> congruential generator. Should N be allowed as high as 2^44? I don't
> think so.
>
> Reference:
>
> B. A. Wichmann and I. D. Hill,
> Algorithm AS 183: An Efficient and Portable Pseudo-Random Number
> Generator,
> Journal of the Royal Statistical Society. Series C (Applied
> Statistics), Vol. 31, No. 2 (1982), pp. 188-190.
>
> Kenji Rikitake
>
>> On 02/11/2014 06:47 AM, Kostis Sagonas wrote:
>>> The documentation of random:uniform/1 reads:
>>>
>>> uniform(N) -> integer() >= 1
>>>
>>> Types:
>>>
>>> N = integer() >= 1
>>>
>>> Given an integer N >= 1, uniform/1 returns a random integer uniformly distributed between 1 and N, updating the state in the process dictionary.
>>>
>>> and from it a (naive?) Erlang programmer would assume that it works for Erlang integers. However, apparently there is an (unstated) upper limit.
>>>
>>> ========================================================================
>>> Eshell V6.0 (abort with ^G)
>>> 1> random:uniform(1 bsl 42).
>>> 1950905779137
>>> 2> random:uniform(1 bsl 1023).
>>> 64990220693815292632299777453770053245701880982584490305757715776780176648584151835529728245903303858071465265235635364507930685677056366431569479144084789774752709050314473717035731429737215919311815680621634352115003928201262448305879457258028874562676857884269587024825648343920396535221283000212783104001
>>> 3> random:uniform(1 bsl 1024).
>>> ** exception error: an error occurred when evaluating an arithmetic expression
>>> in function random:uniform/1
>>> in call from erl_eval:do_apply/6
>>> in call from shell:exprs/7
>>> in call from shell:eval_exprs/7
>>> in call from shell:eval_loop/3
>>> in call from prim_file:set_cwd/2
>>> ========================================================================
>>>
>>> Minimally, the published documentation (and the types of all functions of this module) has to be updated to explicitly mention this limit.
>>>
>>> Ideally, the implementation has to change to avoid use of floats when manipulating Erlang integers. IMO, it does not really have to do this and result in crashes like that.
>>>
>>> Kostis
More information about the erlang-bugs
mailing list