[erlang-questions] foldl vs foldr and ++
Masklinn
masklinn@REDACTED
Wed May 9 21:20:27 CEST 2012
On 2012-05-09, at 21:00 , Andrew Ledvina wrote:
> Okay, so I was profiling something and it led me to write this:
>
> from_left(X) ->
> lists:foldl(fun(Y,R) -> R ++ [{Y+1,Y*2}] end, [], X).
>
> from_right(X) ->
> lists:foldr(fun(Y,R) -> [{Y+1,Y*2}] ++ R end, [], X).
>
> timeit(N) ->
> X = lists:seq(1,N),
> statistics(runtime),
> statistics(wall_clock),
> from_left(X),
> {_,LTime1} = statistics(runtime),
> {_,LTime2} = statistics(wall_clock),
> from_right(X),
> {_,RTime1} = statistics(runtime),
> {_,RTime2} = statistics(wall_clock),
> io:format("Left runtime: ~p (~p)~n", [LTime1,LTime2]),
> io:format("Right runtime: ~p (~p)~n", [RTime1,RTime2]).
>
> My understanding is that foldl is preferred over foldr because it is tail
> recursive. I was also to understand that putting the accumulator on the
> left-hand-side of the ++ operation is super bad. So, the following results
> were a bit of a surprise at first:
>
> Erlang R14B01 (erts-5.8.2) [source] [smp:4:4] [rq:4] [async-threads:0]
> [hipe] [kernel-poll:false]
>
> Eshell V5.8.2 (abort with ^G)
> 1> listtime:timeit(10000).
> Left runtime: 290 (286)
> Right runtime: 0 (1)
> ok
> 2> listtime:timeit(100000).
> Left runtime: 30670 (31011)
> Right runtime: 10 (13)
> ok
>
> The times are in ms. 30 s versus 10 ms. The result of both functions
> from_left and from_right are identical. Also, I am clearly aware that these
> are both equivalent to just a more simple map. I think I understand the
> results after a bit of a go around but is there anyone who has a simple,
> clear explanation?
Well the explanation is exactly in what you wrote:
> I was also to understand that putting the accumulator on the
> left-hand-side of the ++ operation is super bad.
that is indeed the case: in your left fold, the runtime has to copy all of
the accumulator to add data to its end, whereas in your right fold it just
has to create a new cons and append the old accumulator to it.
The cost of cloning thousands of conses at each step is way over any potential
saving from a tail-recursive left fold.
To see this, replace your from_* functions by these:
from_left(0) ->
[0];
from_left(N) ->
from_left(N-1) ++ [N].
from_right(0) ->
[0];
from_right(N) ->
[N] ++ from_right(N-1).
and you'll get this:
1> listtime:timeit(10000).
Left runtime: 390 (401)
Right runtime: 0 (0)
ok
2> listtime:timeit(100000).
Left runtime: 56450 (57424)
Right runtime: 10 (18)
ok
The operation which is costing you is long_list ++ element, because all of
long_list has to be copied.
Oh, and by the way you could (should?) use [Element | List] instead of
[Element] ++ List.
PS: just to prove that a tail-recursive fold is nowhere near good enough to
compensate for the millions of extra copies, here it is:
from_left(0, Acc) ->
Acc ++ [0];
from_left(N, Acc) ->
from_left(N-1, Acc ++ [N]).
6> listtime:timeit(10000).
Left runtime: 310 (316)
Right runtime: 0 (1)
ok
7> listtime:timeit(100000).
Left runtime: 52720 (54166)
Right runtime: 10 (17)
ok
More information about the erlang-questions
mailing list