[erlang-questions] Speedy unsort:shuffle/1,2 ?

Pascal Chapier <>
Fri Jun 4 15:06:49 CEST 2010


Hello,

I have tried a solution which is quite efficient, and I got good results in distribution. But I have trouble with the performance results,
particularly the 2 last values in the table bellow (is there a known bug in erlang:statistics function?). My intent was to evaluate the
relationship between length and shuffle time.
I did a new test, and got extravagant results again, but for other list length - the time for 1 list cannot be 4293323265µs since the program did
the 14 * 1000 loops of test/0 in less than 6 hours -. My configuration is R13B04 on windows XP.

The risk of this method is to get two identical numbers in the tupple_list. I can't evaluate the impact on randomless,
in this case, keysort simply left the elements in original order.

The quality of the method depends on the random:uniform function, maybe some other function could be faster and/or better.

Here is the code I used:

shuffle(P) ->
    Max = length(P)*10000,
    {_,R}= lists:unzip(lists:keysort(1,[{random:uniform(Max),X} || X <- P])),
    R.

test() ->
    test:perfshuffle(shuffle,125,1000),
    test:perfshuffle(shuffle,250,1000),
    test:perfshuffle(shuffle,500,1000),
    test:perfshuffle(shuffle,1000,1000),
    test:perfshuffle(shuffle,2000,1000),
    test:perfshuffle(shuffle,4000,1000),
    test:perfshuffle(shuffle,8000,1000),
    test:perfshuffle(shuffle,16000,1000),
    test:perfshuffle(shuffle,32000,1000),
    test:perfshuffle(shuffle,64000,1000),
    test:perfshuffle(shuffle,128000,1000),
    test:perfshuffle(shuffle,256000,1000),
    test:perfshuffle(shuffle,512000,1000),
    test:perfshuffle(shuffle,1024000,1000),
    test:testshuffle(shuffle,10,1000000).


testshuffle() ->
    io:format("Usage: testshuffle(ShuffleFunctionName,SizOfList,NumberOfTrial)~n").

testshuffle(F,L,T) ->
    S = lists:seq(0,L),
    Nul = [X-X || X <- S],
    {Res,Car} = testshuffle(F,Nul,Nul,S,L,T),
    Moy = [X/T || X <- Res],
    Sig = [math:sqrt(X/T) || X <- Car],
    Tm = L/2,
    Td = math:sqrt((L*(L+2))/12),
    io:format("Theoritical results: average = ~p, deviation = ~p~n",
              [Tm,Td]),
    io:format("Average   :~p~n",[Moy]),
    io:format("Deviation :~p~n",[Sig]).

testshuffle(_,Am,Ac,_S,_L,0) ->
    {Am,Ac};
testshuffle(F,Am,Ac,S,L,I) ->
    Ns = ?MODULE:F(S),
    Am1 = [X+Y || {X,Y} <- lists:zip(Am,Ns)],
    Ac1 = [X+math:pow(Y-L/2,2) || {X,Y} <- lists:zip(Ac,Ns)],
    testshuffle(F,Am1,Ac1,S,L,I-1).

perfshuffle() ->
    io:format("Usage: perfshuffle(ShuffleFunctionName,SizOfList,NumberOfTrial)~n").
    
perfshuffle(F,L,T) ->
    S = lists:seq(0,L),
    statistics(runtime),
    statistics(wall_clock),
    perfshuffle1(F,S,T),
    {_, Time1} = statistics(runtime),
    {_, Time2} = statistics(wall_clock),
    T1 = 1000 * Time1/T,
    T2 = 1000 * Time2/T,
    io:format("Average Shuffle Time for a list of ~p elements~nRuntime   -> ~p~nWallclock -> ~p~n",
                [L,T1,T2]).

perfshuffle1(_F,_S,0) ->
    ok;
perfshuffle1(F,S,N) ->
    _ = ?MODULE:F(S),
    perfshuffle1(F,S,N-1).
    
And the results (performance in µs):    
    

Average Shuffle Time for a list of 125 elements
Runtime   -> 281.0
Wallclock -> 281.0
Average Shuffle Time for a list of 250 elements
Runtime   -> 422.0
Wallclock -> 422.0
Average Shuffle Time for a list of 500 elements
Runtime   -> 891.0
Wallclock -> 906.0
Average Shuffle Time for a list of 1000 elements
Runtime   -> 2109.0
Wallclock -> 2110.0
Average Shuffle Time for a list of 2000 elements
Runtime   -> 4328.0
Wallclock -> 4328.0
Average Shuffle Time for a list of 4000 elements
Runtime   -> 8438.0
Wallclock -> 8515.0
Average Shuffle Time for a list of 8000 elements
Runtime   -> 16828.0
Wallclock -> 17172.0
Average Shuffle Time for a list of 16000 elements
Runtime   -> 38234.0
Wallclock -> 40141.0
Average Shuffle Time for a list of 32000 elements
Runtime   -> 85750.0
Wallclock -> 90125.0
Average Shuffle Time for a list of 64000 elements
Runtime   -> 183688.0
Wallclock -> 192891.0
Average Shuffle Time for a list of 128000 elements
Runtime   -> 442297.0
Wallclock -> 4.735e5
Average Shuffle Time for a list of 256000 elements
Runtime   -> 990390.0
Wallclock -> 1058078.0
Average Shuffle Time for a list of 512000 elements
Runtime   -> 4293323265.0
Wallclock -> 2907109.0
Average Shuffle Time for a list of 1024000 elements
Runtime   -> 1630126.0
Wallclock -> 6554031.0
Theoritical results: average = 5.0, deviation = 3.1622776601683795
Average   :[4.996671,4.996889,4.997597,5.003431,5.000648,4.997709,5.005808,
            4.999759,4.996688,5.003346,5.001454]
Deviation :[3.164504542578506,3.161651941627984,3.1614915783534836,
            3.1614669063585024,3.162072737936305,3.1621861109049227,
            3.160950173602868,3.165070457351621,3.1611447293662467,
            3.160723651317843,3.1637879195673024]

Pascal
 		 	   		  
_________________________________________________________________
Hotmail : Simple et Efficace qui vous facilite la vie… Découvrez la NOW génération !
http://www.windowslive.fr/hotmail/nowgeneration/


More information about the erlang-questions mailing list