[erlang-questions] Breaking out of a foldl

Tue Jun 2 09:41:47 CEST 2009

Absolutely - throw is very cheap.

 The bfoldl/3 function that was suggested is not tail recursive - also
we have to pattern match the return value  of the Fun at *every* level
of recursion.

Recall the definition:

bfoldl(Fun, Acc0, [Head | Tail]) ->
   case Fun(Head, Acc0) of
       {ok, Acc} ->                              %% 1
           bfoldl(Fun, Acc, Tail);
       {break, _Acc} = Result ->        %% 2
           Result;
       break ->
           {break, Acc0}
   end;
...

In lines 1 and 2 the return value of the Fun being called has to be
pattern matched

lists:foldl/3 is defined thusly

foldl(F, Accu, [Hd|Tail]) -> foldl(F, F(Hd, Accu), Tail);
foldl(F, Accu, []) when is_function(F, 2) -> Accu.

This is tail recursive so executed with a stack depth of one, whereas
bfold/3 executes with a
stack depth of the length of the input list.

Now we can ask the questions:

    - which is prettier (code with bfold/3 of code with foldl/3), and
    - which is faster.

I'll write some code to test this. t1(L) sums the lists of the elements in L
but breaks the loop if there is some element in the list which is 95000.

First the timer code:

-module(test1).
-compile(export_all).

test() ->
    L = lists:seq(1,100000),
    T1 = timer:tc(test1, t1, [L]),
    T2 = timer:tc(test1, t2, [L]),
    {T1, T2}.

Now the functions:

t1(L) -> bfoldl(fun f1/2, 0, L).

f1(95000, _) -> break;
f1(N, A)        -> {ok, N + A}.

and

t2(L) -> (catch lists:foldl(fun f2/2, 0, L)).

f2(95000, A) -> throw({break, A});
f2(I, A)          -> I + A.

Both are three lines - no clear winner in the beauty competition.
Now either what this means is clear from the code, or there should be a comment.

Timeing:

c(test1).
{ok,test1}
1> test1:test().
{{12359,{break,4512452500}},{5890,{break,4512452500}}}
2> test1:test().
{{11320,{break,4512452500}},{6231,{break,4512452500}}}

The standard library version is twice as fast

bfoldl is 260 characters foldl 109 characters

/Joe

On Mon, Jun 1, 2009 at 9:05 PM, David Mercer <dmercer@REDACTED> wrote:
>> Yeah, but it's syntax is somewhat misleading. People with
>> Java/C++/Eiffel/etc. background would expect that exceptions should be
>> used in exceptional situation, not instead of a return statement
>> (which Erlang lacks).
>
> But throw/1 is not used to throw exceptions: it is used to throw (quoting
> from the documentation) "A non-local return from a function."  That seems to
> fit the bill in this case.  That other languages make throwing expensive
> enough to recommend that it only be used in exceptional circumstances is a
> restriction on their implementations, not Erlang's.
>
>> -----Original Message-----
>> From: erlang-questions@REDACTED [mailto:erlang-questions@REDACTED] On
>> Behalf Of Attila Rajmund Nohl
>> Sent: Monday, June 01, 2009 1:44 PM
>> To: erlang-questions@REDACTED
>> Subject: [erlang-questions] Breaking out of a foldl
>>
>> 2009/6/1, Joe Armstrong <erlang@REDACTED>:
>> > On Sat, May 30, 2009 at 6:19 PM, Mazen Harake
>> > <mazen.harake@REDACTED> wrote:
>> >> Because it is a hack?
>> >>
>> >> a "fold_until" would be much smoother.
>> >
>> > Uuugh - smoothness?????????????
>> >
>> > Why have two functions when one (the existing foldl) is perfectly
>> adequate?
>> >
>> > The use of throw to abnormally terminate a recursion is not a hack.
>> > That is what
>> > catch-throw was designed to do. exit/1 is for raising errors, throw/1 is
>> for
>> > abrupt termination of a computation.
>>
>> Yeah, but it's syntax is somewhat misleading. People with
>> Java/C++/Eiffel/etc. background would expect that exceptions should be
>> used in exceptional situation, not instead of a return statement
>> (which Erlang lacks).
>>
>> [...]
>> > Wirth (the blessed) said something like (paraphrased) every time we
>> > add something
>> > to a language we should ask *what can we remove*. If we just add stuff
>> > to languages
>> > (or libraries) - without removing stuff, we add additional complexity
>> > so we violate
>> > the principle of being as simple as possible.
>> >
>> > I am often horrified by libraries that offer dozens of different ways
>> > to do essentially the
>> > same thing - this makes the documentation almost unreadable (since it is
>> > unclear
>> > which the kernel methods are) and makes learning very difficult.
>> >
>> > Libraries should provide a small set of primitive parts which can be
>> > glued together
>> > to solve a large number of problems. If the set of parts is large then
>> > learning the library will be very difficult. If you don't know all the
>> > functions in a library then
>> > programming using the library will be painfully slow since you will
>> > have to google
>> > your way through large volumes of documentation.
>> >
>> > If you really really want fold_until then I suggest you make your own
>> > personal library
>> > (mylib.erl) where you put fold_until (and every other additional
>> > function that you think
>> > is missing from the standard libraries) - then one day when you are
>> > satisfied that
>> > your library has proved useful you can publish it.
>> >
>> > Exactly what should be in the standard libraries is an extremely
>> > difficult problem-
>> > I think they should contain small sets of orthogonal functions and err
>> > on the side
>> > of generality. If two function do more of less the same thing one of
>> > them should be removed.
>>
>> I disagree. Every Erlang project I've seen had it's own tracing module
>> built on top of dbg. They all had very similar functions that did
>> nearly the same thing. This obviously means that the dbg module needs
>> a function that simply turns on tracing on a function of a module.
>> Yes, it could be done with dbg:tpl, but why shall I type every time
>> [{'_',[],[{return_trace}]}] into the shell? Also the equivalent of
>> lists:keyfind was probably implemented countless times in most Erlang
>> projects.
>>
>> I do think it's better to have often needed code in the standard
>> library. First of all, it's standard, so every programmer should be
>> familiar with it. On the other hand the "helper" functions in the
>> various mylib libraries are probably not well known to anyone except
>> the author. The above mentioned tracing modules had functions doing
>> essentially the same, but each with a slightly different syntax which
>> is quite annoying.
>>
>> Using third party libraries (e.g. someone's released mylib) has other
>> drawbacks. It might not be useable at all due to licensing. It's an
>> other dependency that has to be followed (e.g. when a bugfix release
>> comes out). It might not be maintained anymore. It doesn't have the
>> same user base as the standard library, so it might have more bugs.
>> From the management point of view it's better to work with the
>> standard library and in-house code, without 3rd party libraries.
>>
>> I think that the standard library should first and foremost make
>> programming easier. For example, the JTable class in Java has 7
>> constructors. A couple of them probably could be removed to get a more
>> minimal code - but the removed code will popup in the applications
>> many more times and probably with many more bugs.
>>
>> ________________________________________________________________
>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>> erlang-questions (at) erlang.org
>
>
> ________________________________________________________________
> erlang-questions mailing list. See http://www.erlang.org/faq.html
> erlang-questions (at) erlang.org
>
>