Tail/Non tail recursion

Fri Aug 29 09:12:27 CEST 2003

Here are the results from the benchmark tool from the efficiency guide:

               R7B01  R8B   R9B   R9C
-------------------------------------
square plain  1.22  1.01  1.01  1.00
square lcomp  1.34  1.01  1.03  1.01
square tail   1.23  1.05  1.02  1.03
square map    1.70  1.54  1.48  1.48

On my old Sun Ultra 10 for lists of length 30000. The results are 
normalized execution time.

Vlad Dumitrescu's simple benchmark indicated something else, and 
demonstrated that benchmarks are a bit tricky. The benchmark tool from 
the efficiency guide does what it can to eliminate errors. For example 
are all tests run in a new process so they will have to do the garbage 
collection in the same way. In Vlad's benchmark the first test made the 
heap grow so the subsequent tests had a sufficiently big heap to work on 
and could run almost without garbage collections.

Here's the code. The benchmark tool produces a neat .html page with the 
test results. Try it - you may get addicted. You may even get reliable 
results.

=============================================================
-module(square_bm).
-include("bench.hrl").
-export([benchmarks/0]).
-export([bm_square_plain/1,bm_square_tail/1,
	 bm_square_lcomp/1, bm_square_map/1]).

benchmarks() ->
     {100, [bm_square_plain,bm_square_tail,bm_square_lcomp,bm_square_map]}.

bm_square_plain(Iter) ->
     iterate(Iter, fun square_plain/1).

bm_square_tail(Iter) ->
     iterate(Iter, fun square_tail/1).

bm_square_lcomp(Iter) ->
     iterate(Iter, fun square_lcomp/1).

bm_square_map(Iter) ->
     iterate(Iter, fun square_map/1).

iterate(Iter, Fun) ->
     List = lists:seq(1, 30000),
     iterate(Iter, Fun, List).

iterate(0, _Fun, _List) ->
     ok;
iterate(Iter, Fun, List) ->
     Fun(List),
     iterate(Iter-1, Fun, List).

square_plain([]) -> [];
square_plain([H|T]) -> [H*H|square_plain(T)].

square_tail(L) -> square_tail(L, []).
square_tail([], R) -> lists:reverse(R);
square_tail([H|T], R) -> square_tail(T, [H*H|R]).

square_lcomp(L) -> [X*X || X <- L].

square_map(L) ->
     lists:map(fun(X) -> X*X end, L).
=============================================================

Short user's guide:

Place the above benchmark file square_bm.erl in the same directory as 
the benchmark tool's two files all.erl and bench.erl.

Edit all.erl at the end to point out your installation(s) of erlang.

Move all *_bm.erl files to some other directory (e.g a subdirectory 
./DISABLED) that you do not want to be run in the benchmark.

Start an Erlang command shell and give the commands:
c(all).
all:run().

Exit the Erlang command shell and edit square_bm.erl to change 
iterations and/or list length to get run times of a few seconds per 
test. Since everyone in the world probably has got faster machines than 
I you will have to increase the numbers.

Re-enter the Erlang command shell and give the command:
all:run().

Iterate the last two paragraphs until satisfied. Have a look with your 
web browser on the file index.html to see the results.

/ Raimo Niskanen, Erlang/OTP, Ericsso AB.

Ingela Anderton wrote:
> Luke Gorrie wrote:
> 
>>There was an eye-wateringly detailed look at the performance of
>>various implementations of roughly this algorithm on this list two
>>years ago: tail recursion + reverse, continuation-passing style,
>>etc. I don't know if the conclusions are still accurate for R9C, but
> 
> You could always make a new test with help of the benchmark framework
> provided in the efficiency guide to come to the right conclusion for R9C.
>