[erlang-questions] Erlang shows its slow face!

Wed Nov 17 06:45:09 CET 2010

On 17/11/2010, at 4:34 AM, Edmond Begumisa wrote:

> On Wed, 17 Nov 2010 01:20:51 +1100, Kostis Sagonas <kostis@REDACTED> wrote:
> 
>> There are two basic issues here:
>> (1) The Erlang compiler doesn't really handle list comprehension as
>>    native loops; it compiles list comprehensions to recursive
>>    functions.  lists:seq(1, N) is not special syntax -- unlike Haskell --
>>    and almost surely builds an actual list.
>> Yes.  This is an issue.  Actually, it's more general: in my experience most Erlang programmers do not understand the difference between list comprehensions and lists:foreach/2. Many of them find list comprehension notation so nice that they often use it in places where building the list is not only is not their intention but is completely unnecessary.
>> The compiler is not sufficiently clever to perform deforestation and avoid construction of intermediate lists.
>> 
> 
> First I'm hearing of this. It really ought to be in one of those green boxes in the documentation.

Let's go around this one more time.

The current Erlang compiler turns

	[f(X) || X <- lists:seq(1, N)]

into
	<foo>(lists:seq(1, N), ...)
where
	<foo>
is a function generated by the compiler.

The compiler *could* make a special case of lists:seq/[2,3]
and generate

	<foo>(1, N, ...)

But it doesn't.  The actual definition of lists:seq/2 not
only *could* change, in the not too distant past it *has*
changed.  The compiler can't inline the definition and
use whether it is now, because it's a remote call (even
if -imported), so it might change at run time.

Haskell compilers and the Mercury compiler *do* perform
deforestation because they are given an unchanging program
to work with and can safely do cross-module inlining.
Erlang can't.

This would be a potentially serious problem if iterating
over integer ranges were something Erlang spent a lot of
time doing.

I just checked an Erlang/OTP release I keep handy for such
things.  
    ~ 1,500,000 raw lines
    ~   950,000 SLOW
             34 list:seq iterations
	     18 in two EUNIT tests
	      3 in other test cases (at least)
	     13 list:seq not in an ?_assert
    ~      3200 Pat <- Exp

The unit tests don't matter for performance.  We're left
with about one SLOC in every 72,000 being an iteration of
this kind, or about <- in every 250 being an integer
iteration.

The Erlang/OTP team probably feel they have other maddened
grizzly bears to stun before they bother with something that
seems to turn up mainly in toy benchmarks, and who could blame them?