[erlang-questions] foldl vs foldr and ++

Thu May 10 11:08:13 CEST 2012

Try this benchmark program with, e.g. L = lists:seq(1,100000).

-module(lr).
-export([l/1, r/1, x/1, foldr/3]).

l(L) ->
    timer:tc(lists, foldl, [fun erlang:'+'/2, 0, L]).

r(L) ->
    timer:tc(lists, foldr, [fun erlang:'+'/2, 0, L]).

x(L) ->
    timer:tc(?MODULE, foldr, [fun erlang:'+'/2, 0, L]).

foldr(F, Acc, L) -> lists:foldl(F, Acc, lists:reverse(L, [])).

lists:foldl/3 and lists:foldr/3 run a fairly even race, whereas lr:foldr/3 is consistenty slower.

BR,
Ulf W

On 10 May 2012, at 10:53, Hynek Vychodil wrote:

> When I see foldr implementation I wonder why it is not simple
> 
> foldr(F, Acc, L) -> foldl(F, Acc, lists:reverse(L, [])).
> 
> There should be less memory consumption. lists:reverse is BIF which I
> believe consume less memory than stack frame per each list element
> what current foldr implementation do. It should be faster and both
> accepts only proper lists. Only difference which I can found current
> implementation consumes memory on stack but above implementation would
> consume heap and affect GC behavior. Current foldr implementation
> would make sense in Haskell but I don't understand its reason in
> Erlang.
> 
> On Thu, May 10, 2012 at 10:26 AM, Robert Virding
> <robert.virding@REDACTED> wrote:
>> The documentation is referring to the *implementation* of the foldl/foldr functions, foldl uses a tail-recursive function while foldr cannot do that. The difference in speed comes from what you do inside the fold function. As others have pointed out the efficiency of ++ depends solely on its left argument.
>> 
>> Robert
>> 
>> ----- Original Message -----
>>> Aha I just looked at my own explanation and realized it made no
>>> sense. Of course the slow one is slow for exactly the reason I
>>> expected. Sorry for the waste, but thank you for the clear
>>> complexity analysis.
>>> 
>>> Andrew Ledvina
>>> 
>>> On May 9, 2012, at 2:57 PM, "Richard O'Keefe" <ok@REDACTED>
>>> wrote:
>>> 
>>>> 
>>>> On 10/05/2012, at 7:00 AM, Andrew Ledvina wrote:
>>>> 
>>>>> Okay, so I was profiling something and it led me to write this:
>>>>> 
>>>>> from_left(X) ->
>>>>>   lists:foldl(fun(Y,R) -> R ++ [{Y+1,Y*2}] end, [], X).
>>>>> 
>>>>> from_right(X) ->
>>>>>   lists:foldr(fun(Y,R) -> [{Y+1,Y*2}] ++ R end, [], X).
>>>> 
>>>> Immediate reaction:  from_left/1 is going to be atrociously slow.
>>>> 
>>>>> My understanding is that foldl is preferred over foldr because it
>>>>> is tail recursive.
>>>> 
>>>> ***other things being equal***
>>>> 
>>>> In this case, they are not.  It should be clear that from_right/1
>>>> is
>>>> O(|X|) because it does O(1) work per element of X, while
>>>> from_left/1 is O(|X|**2) because it does O(|R|) work per element of
>>>> X, and R grows from empty to 2|X|-2 elements long.
>>>> 
>>>> Write [{Y+1,Y*2} || Y <- X] and let the compiler worry about the
>>>> best way to do it.
>>>> 
>>>>> I was also to understand that putting the accumulator on the
>>>>> left-hand-side of the ++ operation is super bad. So, the
>>>>> following results were a bit of a surprise at first:
>>>> 
>>>> If you understood that long++short is more expensive than
>>>> short++long,
>>>> you should NOT have been surprised by your result.
>>>> 
>>>> 
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>> 
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
> 
> 
> 
> -- 
> Hynek Vychodil
> Chief Scientists
> 
> GoodData
> náměstí 28. října 1104/17, 602 00, Brno - Černá Pole
> Office:   +420 530 50 7704
> E-mail:  hynek@REDACTED
> Web:     www.gooddata.com
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions

Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
http://feuerlabs.com