[erlang-questions] Parallelism lesson for an idiot

Fri Apr 5 05:10:08 CEST 2013

The idiot in question was me.  A colleague handed me a book about
parallel programming, and I thought, well, maybe in the concurrent
programming paper I should show the students a parallel sorting
algorithm using POSIX threads.  So I wrote one.  In C.  I was ever
so pleased.  It worked first time.

The problem is that it's 1.06x to 2.61x _slower_ than the sequential
sorting algorithm it uses as a base, for _any_ number of threads I
tried, on a 4-core machine.  A few minutes' thought told me
why.  In order to get a time that was worth speeding up, I gave it a
lot of data to sort.  And that meant that the memory bus was being
saturated.  And since there's only one of _that_, there was no way
that having multiple cores was going to make anything faster.

The thing is that I've known for a couple of decades that managing
*memory* is the key to good performance.  I think I forgot to think
about _what_ I was parallelising and whether that was _everything_
that needed to be parallelised.

This will be shown to the concurrent programming students after all,
but as a dreadful warning.

One of the things I like about Erlang is that it gently leads me
away from the illusion that shared memory is cheap.