[erlang-questions] trouble with erlang or erlang is a ghetto

Wed Jul 27 08:08:15 CEST 2011

On 07/26/2011 09:42 PM, Richard O'Keefe wrote:
> (3) There doesn't seem to be anything in the Erlang _approach_ that
>     should interfere with scaling, but the _implementation_ appears
>     not to scale well much past 16 cores.
>
>     I don't know if that is current information.  If true, it's an
>     important limitation of the implementation which I'm sure will be
>     given a lot of attention.  I don't expect it to stay true.

Any problems scaling on systems with more than 16 cores seem to be related to the current cost of those systems (I would love to know what the tests currently show on Tileras with more than 16 cores, though).  Erlang appears to scale much more naturally than Java, and the criticism of the Erlang garbage collector is unsubstantiated (as previously mentioned).  You cannot say an approach is wrong because it isn't the Java way, or perhaps the Sun way.  Per-process garbage collection avoids central state, so it encourages designs that scale through parallelism.  Throwing a ton of money at Azul to push a single-heap garbage collector beyond normal limits might sound fun, but it only shows how long and drawn out technical failure can be (like Sun's stock price, for instance, which traced an almost perfect bell curve).

> (4) He doesn't seem to like the Erlang garbage collector, but that's
>     something which has changed more than once, and he does not
>     offer any actual _measurements_.
>
>     I tried the experiment of allocating 1,000,000,000 list cells
>     (but never keeping more than 10,000 of them at a time).
>     Erlang, byte-codes:  7.58 seconds (=  7.58 nsec/allocation).
>     Erlang, native    :  3.92 seconds (=  3.92 nsec/allocation).
>     Java -O -server   : 11.40 seconds (= 11.40 nsec/allocation).
>     Java -O -client   : 12.26 seconds (= 12.26 nsec/allocation).
>
>     Java has come a long way.  I don't have an Azul system to try.
>
>     He praised tcmalloc.  I note that it only recently became usable
>     without pain on my laptop (MacOS X) and that building it produced
>     reams of warning messages about using a deprecated interface, so
>     it may not work much longer.  It doesn't work at all on the other
>     machine on my desk.  (Erlang works on both.)  I wrote a similar
>     benchmark in C and linked it with libtcmalloc.a.  I killed that
>     program after it had run for more than 10 times as long as the
>     Erlang code.  So when he says "Erlang ... can't take advantage of
>     libraries like tcmalloc", there doesn't appear to be any
>     advantage that Erlang *could* take.
>
>     In short, Erlang's memory management is criticised, and it MAY be
>     that this is justified, but the blog entry provides no EVIDENCE.
>
>     By the way, savour the irony.  "Erlang's approach [of] using
>     separate heaps per process", which he criticises, is in fact
>     used elsewhere: ptmalloc does it, the tcmalloc documentation
>     makes it absolutely clear that tcmalloc does this also (more
>     precisely, it uses a per-thread cache, which is what the "tc"
>     part of the name means), and some recent Java systems have
>     done the same thing, with a per-thread cache for memory
>     management, backed by a shared heap.  The point of the per-
>     thread cache is to reduce locking.  There is a spectrum of
>     approaches from nothing shared to everything shared, and it
>     seems clear that everyone sees merit in not being at either
>     extreme.  Progress must be driven by measurement.
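A benchmark of the kind described in (4) could look roughly like this in Java.  This is only a guess at its shape from the description (allocate N list cells, never keep more than 10,000 reachable), not Richard's actual code.  The names `AllocBench`, `Cell`, and `liveAfter` are made up for illustration, and the JIT may optimise more than expected, so any timing is a rough indication at best.

```java
// Hypothetical sketch of the allocation benchmark; not the original code.
class AllocBench {
    // A minimal cons cell, standing in for an Erlang list cell.
    static final class Cell {
        final long head;
        final Cell tail;
        Cell(long head, Cell tail) { this.head = head; this.tail = tail; }
    }

    // Allocates `total` cells while keeping at most `maxLive` reachable.
    // Returns the length of the list that is still live at the end.
    static int liveAfter(long total, int maxLive) {
        if (maxLive < 1) throw new IllegalArgumentException("maxLive must be >= 1");
        Cell list = null;
        int live = 0;
        for (long i = 0; i < total; i++) {
            if (live == maxLive) {   // drop the whole list; it becomes garbage
                list = null;
                live = 0;
            }
            list = new Cell(i, list);
            live++;
        }
        // Walk the surviving list so the result depends on the allocations.
        int length = 0;
        for (Cell c = list; c != null; c = c.tail) length++;
        return length;
    }

    public static void main(String[] args) {
        long total = args.length > 0 ? Long.parseLong(args[0]) : 1_000_000_000L;
        long start = System.nanoTime();
        int live = liveAfter(total, 10_000);
        long elapsed = System.nanoTime() - start;
        System.out.printf("%d cells, %d live at end, %.2f s (%.2f ns/allocation)%n",
                total, live, elapsed / 1e9, (double) elapsed / total);
    }
}
```

Running it with a smaller count first (for example `java AllocBench 100000000`) is a quick way to check that the machine and JVM settings behave sensibly before committing to the full run.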

It is hard to believe that there is still controversy here.  Per-process garbage collection avoids shared state, which avoids the need for low-level locking, which is what makes Erlang scalable.  To argue for some other approach in Erlang seems like idiocy to me, because the result wouldn't be a real Actor model that can provide fault tolerance (that is, actually keep failures isolated).  Why would you want to fool around with some broken fake Actor implementation that is unscalable (like your operating system, for instance)?  It seems like a waste of time, just like this garbage collector complaint.
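To make the locking point concrete, here is a small Java sketch of the middle-of-the-spectrum design from the quoted text: a per-thread cache backed by a shared pool.  It is not how tcmalloc, ptmalloc, or any JVM is implemented; `CachedPool`, `BATCH`, and the lock counter are hypothetical names chosen for illustration.  The only thing it is meant to show is that the shared lock is taken once per batch instead of once per allocation.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: buffers are handed out from a per-thread cache,
// and the shared pool (with its lock) is only touched in batches.
class CachedPool {
    static final int BATCH = 64;   // buffers moved per lock acquisition (arbitrary)

    private final ReentrantLock lock = new ReentrantLock();
    private final ArrayDeque<byte[]> shared = new ArrayDeque<>();
    private final AtomicLong lockCount = new AtomicLong();
    private final int bufferSize;

    // Each thread gets a private cache that it can use without locking.
    private final ThreadLocal<ArrayDeque<byte[]>> cache =
            ThreadLocal.withInitial(ArrayDeque::new);

    CachedPool(int bufferSize) { this.bufferSize = bufferSize; }

    byte[] acquire() {
        ArrayDeque<byte[]> local = cache.get();
        if (local.isEmpty()) refill(local);      // slow path: takes the shared lock
        return local.pop();                      // fast path: no lock
    }

    void release(byte[] buf) {
        ArrayDeque<byte[]> local = cache.get();
        local.push(buf);
        if (local.size() > 2 * BATCH) spill(local);  // return surplus to the shared pool
    }

    private void refill(ArrayDeque<byte[]> local) {
        lock.lock();
        try {
            lockCount.incrementAndGet();
            for (int i = 0; i < BATCH; i++) {
                byte[] buf = shared.poll();      // null when the shared pool is empty
                local.push(buf != null ? buf : new byte[bufferSize]);
            }
        } finally {
            lock.unlock();
        }
    }

    private void spill(ArrayDeque<byte[]> local) {
        lock.lock();
        try {
            lockCount.incrementAndGet();
            for (int i = 0; i < BATCH; i++) shared.push(local.pop());
        } finally {
            lock.unlock();
        }
    }

    // Number of times the shared lock has been taken so far.
    long lockAcquisitions() { return lockCount.get(); }
}
```

Erlang's per-process heaps sit further toward the nothing-shared end than this: a process allocating on its own heap has no shared pool to fall back on in the common case, so there is no lock to batch in the first place.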

Yes, HiPE has issues, Erlang is not meant for all programs, and the syntax is different.  Nothing is perfect.

Not even the standard library is perfect.  I have yet to hear of a perfect standard library in any language.  I think problems with a standard library are natural, because the people who write it impose a taxonomy on functionality that not everyone shares, and none of us is psychic.  Learning a standard library is a normal part of computer science, where you get to relate it to the others you have had the misfortune to use in the past.  Why bother complaining about a taxonomy that is just as bad as any other?  If it serves the purpose it was designed for well, then there is no reason to care; it just happens to be different from what you might expect based on your own limited knowledge.

> And so it goes.
>
> There *are* things about Erlang that can be improved.
> Some things *have* been improved, some things are being improved,
> and some things don't need anyone to wait for Ericsson to do them.

The frustration with these issues seems natural and common, both on this mailing list and elsewhere.  However, I think it is important to be proactive rather than giving in to emotional arguments that lack justification or evidence.