[erlang-questions] ot: Swedish article featuring Erlang Solutions CTO

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Mon Jun 13 11:46:17 CEST 2011

On Mon, Jun 13, 2011 at 10:41, Robert Virding
<robert.virding@REDACTED> wrote:
> What is perhaps more interesting than the article itself are all the comments. "I know nothing about language X but I know that it is 10-15 times slower than language Y".

He is considering the "Computer Language benchmarks game",
specifically the "fasta" benchmark on a Quad Core:


You can make the opposite conclusion by considering the thread-ring benchmark:


Which shows Haskell is fastest, Erlang is about 3 times slower and GCC
is 27 times slower than Haskell. The beauty is indeed in the eye of
the beholder.

The "fasta" benchmark is not very Erlang-optimized either. First, the
benchmark has three parts which are independent. We are only using one
out of the 4 CPUs at the moment. Second, the first part is about
cycling on an array. It is straightforward to do in C, and we will
have more work to do. Rearranging the code around the binaries used
can probably yield a speedup. For starters, we could ask if BEAM
optimizes binaries at all. But the run time is probably dwarfed by the
next two parts, so going parallel should be the first step.

Third, the last two parts are about lookups in tables. The C-version
builds up an array in which it looks up values quickly in constant
time. The array is built such that it minimizes floating point
roundoff cases and uses a branch-prediction hint to make it faster.
The Erlang code creates a list and does O(n) lookups in the list.
Finally, this is floating point intensive, an area where Erlang
doesn't really shine too much.

In other words: Build up a tuple of 4096 elements like in the
C-version and use it with element/2 to get constant time lookup and
see what happens :)


The danger of concluding something from a synthetic benchmark is that
you are going to conclude the wrong thing. Especially if you don't
know the internals of the language. The Haskell version for instance
does caching of the randomized lookup table as well. This ought to
speed up the computation a bit for Haskell as well. I Dare say the
algorithm used is different when considering the three languages.

Also, note that the benchmark is *extremely* cache-friendly. All data
easily fits into the CPU cache and you will never ever need to move
anything from main memory if you do it somewhat correctly. Since you
are not waiting on the memory bottleneck at all, of course it will be
blindingly fast. In my experience, a language with the traits of
Erlang (dynamically typed, interpreted) fares worse here than other
languages. If the memory bottleneck comes into play, then not so much.


Finally, the whole notion that "Functional Programming is slow" is a
fallacy. Yes, back in 1981 when memory was constrained and the only
thing you had were interpreted Lisps, then yes, *perhaps*. What is
more likely however is that people did not understand how to program
functionally. Today, 30 years of research later, there has been so
much gain that it is not clear at all it is slow on a single core. And
for multiple cores...


More information about the erlang-questions mailing list