[erlang-questions] Mersenne Twister in Erlang

Wed Jun 6 12:53:12 CEST 2007

Hi folks,

The Erlang port of the Mersenne Twister pseudo-random number 
generator[1] previously posted on this mailing list[2] seems to no 
longer be available so I ported it again, the work of an evening or two. 
For posterity and Googlability in the mailing list archives I've 
attached it to this mail; if anyone would like to take a look at it and 
suggest further refinements to my implementation I'd not complain.

The module includes a compatibility interface to stdlib:random so you 
should be able to drop it into your code with just s/random:/random_mt:/ 
(although it will generate different numbers, of course). The 
random_mt_test.erl file in the tarball generates the same output when 
run as the mtTest.c file in the Mersenne Twister distribution so you can 
check the correctness of the implementation.

My first attempt at porting used a binary in an exactly analogous way to 
the original C code (the comments from the C are still embedded in this 
final version if anyone wants to compare). The result was roughly 0.6 
times the speed of the random module in stdlib (my benchmark was 
random-numbers-generated-per-second), already better than the previous 
implementation (~0.33*stdlib:random) but compiling with [native] made it 
even slower.

I replaced the binary with a tuple, a relatively straightforward change, 
reasoning that at least it still had O(1) access time. The result was 
something that at least got faster when compiled with [native], but both 
the native and non-native versions were slower than with binaries!

Finally I figured there was no getting around the large update times of 
tuples and binaries, whatever the order of their access times. Since the 
Mersenne Twister algorithm regenerates the entire 624-element array one 
element at a time whenever it runs out of state (once every 624 calls to 
genrand/1), I was rebuilding my binary or tuple over 600 times when 
really I should be able to build it incrementally. Most of the accesses 
to the array during the rebuilding are sequential, so with a bit of 
messing around I was able to use lists, and the performance leapt up - 
roughly on a par with stdlib:random (or a bit faster) when interpreted, 
and three times faster than stdlib:random when compiled with [native]. 
This corresponds to 3,000,000 random numbers per second on my 2GHz 
Athlon64. (Not to mention better quality random numbers than the 
stdlib:random module.)

Hope other people find this useful,
Tim.

[1] http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
[2] 
http://www.erlang.org/pipermail/erlang-questions/2001-September/003579.html

-- 
Tim Bates
tim@REDACTED
-------------- next part --------------
A non-text attachment was scrubbed...
Name: random_mt.tar.gz
Type: application/gzip
Size: 3691 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20070606/73cf4fe7/attachment.bin>