[erlang-questions] Erlang shows its slow face!

Wed Nov 17 06:12:29 CET 2010

On 17/11/2010, at 3:51 AM, Evans, Matthew wrote:

> Interesting thread...I think we can all take the following from this:
> 
> 1) Write your solution in the "Erlang way", using Erlang's strengths such as spawning parallel processes, also HIPE and NIFS where appropriate.

I think this is important for any language:  you really do need to learn how
to work *with* the language.

We are *all* of us having to learn new ways of programming.
I've got quite good at seeing how to make single-core programs go faster.
Seeing the parallelism in a problem is still a challenge for me.

Let's not forget that one of Erlang's strengths is distribution.
Moving a computation to where the data is has been a good idea (sometimes!)
for a long time.  Moving a computation is NOT necessarily the same thing
as moving executable code; it could be some sort of data structure
requesting a particular activity, which might or might not result in new
code being generated, compiled, and loaded on a remote server.

> 2) Write your solution efficiently - true in all languages, but especially in Erlang where you could end up doing needless recursion.

"needless recursion"?  There is nothing special about needless *recursion* to
warrant distinguishing it from any other kind of needless work.

I'm not sure that I go along with this.  When I presented the A and B determine C
version of the Pythagorean triple finder, I *knew* that there were more efficient
ways to do it.  The code was efficient *enough* to make my point.

So let's say "Don't make your solution blatantly inefficient".
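
For readers who did not see the earlier message, here is a minimal sketch of
what an "A and B determine C" search can look like.  It is not the code from
that message.  The function name count_triples, the ordering A <= B, and the
bound A + B + C <= n are all assumptions made for illustration.  The idea is
simply that once A and B are chosen there is at most one C worth testing, so
no third loop over C is needed.

```c
/* Count triples (a, b, c) with a <= b, a*a + b*b == c*c and
   a + b + c <= n.  The perimeter bound is an assumed problem
   statement, not something taken from the post. */
static long count_triples(long n)
{
    long count = 0;
    for (long a = 1; a < n; a++) {
        long c = a;                     /* c never shrinks as b grows */
        for (long b = a; a + b < n; b++) {
            long s = a * a + b * b;
            while (c * c < s) c++;      /* smallest c with c*c >= s */
            if (c * c == s && a + b + c <= n) count++;
        }
    }
    return count;
}
```

This is quadratic in n, which is far from the best known approach but
not blatantly inefficient either.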

People often focus on computational steps when thinking about efficiency.
Memory is also something to think about.
Let's take that 100 million element data set I mentioned earlier in this
thread.  It's basically just a rather large sparse matrix with
	500,000 rows
	 20,000 columns
    100,000,000 nonzero entries (they fit in a byte).
Store it as a full array, and you need 10 GB of memory.
The machine I'm typing on today has 3 GB.  Oops.
One way to represent a sparse matrix is as an array of
pointers to pairs of (column number, value) arrays,
which works out at about 303 MB of memory.
The method I'm using takes about 210 MB of memory.
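
The arithmetic behind the first two figures can be sketched as follows.
This is an illustration only: the names sparse_row, sparse_get, dense_bytes
and sparse_bytes are hypothetical, the 16-bit column number and the per-row
overhead are assumptions, and nothing here describes the 210 MB method,
whose layout is not given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One sparse row as parallel arrays of column numbers and values.
   Parallel arrays avoid the padding a {uint16_t, uint8_t} struct
   would normally get. */
typedef struct {
    uint32_t count;         /* nonzero entries in this row */
    const uint16_t *cols;   /* column numbers (20,000 fits in 16 bits) */
    const uint8_t *vals;    /* the matching one-byte values */
} sparse_row;

/* Look up cell (r, c); anything not stored is zero. */
static uint8_t sparse_get(const sparse_row *rows, size_t r, uint16_t c)
{
    assert(rows != NULL);
    for (uint32_t i = 0; i < rows[r].count; i++)
        if (rows[r].cols[i] == c) return rows[r].vals[i];
    return 0;
}

/* Dense storage: one byte per cell. */
static uint64_t dense_bytes(uint64_t rows, uint64_t cols)
{
    return rows * cols;
}

/* Sparse storage: 2 bytes of column number plus 1 byte of value per
   nonzero entry, plus per_row bytes of overhead for each row. */
static uint64_t sparse_bytes(uint64_t rows, uint64_t nonzeros,
                             uint64_t per_row)
{
    return nonzeros * 3 + rows * per_row;
}
```

With 500,000 rows and 20,000 columns, dense_bytes gives 10,000,000,000.
With 100,000,000 nonzero entries, sparse_bytes gives 302,000,000 for a
4-byte row pointer and 304,000,000 for an 8-byte one, which brackets the
figure of about 303 MB.  The sparse_row struct shown here carries more
per-row overhead than a bare pointer, so it would come out a little higher.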

It's not just the quantitative difference, taking >47 times less
memory, it's the qualitative difference between being able to
do the job in the memory I have or not.

By the way, the same data structure could be adapted to Erlang,
taking about 434 MB.  I haven't tried it, and it would take a fair
bit longer to load, but it should work.  If done, this would
provide an example of
	Erlang + efficient data structure : slowish but workable
	C + inefficient data structure : unusable

In much the same way, I once had a statistical calculation where
I wanted to use a published Fortran algorithm.  I had trouble
getting the published code to work, so I rewrote it in Prolog.
When I had that going, I used it to help debug the Fortran
version.  And at the end of the day, the Prolog version was
*faster*, because it used a better data structure.  And the
Prolog code wasn't even running as native instructions!
(To this day, rewriting something tricky in another language is
a debugging strategy I find helpful.)

> 3) Erlang isn't always the best language for all problems. Neither is C, C++, C#, Java or any other language. Choose the language, or a mix of languages, that are the best fit your problem.

It's also important to be clear about what's *problematic* about the problem.
*Development* time and *execution* time are different.
If you are trying to develop embedded software for something where the (moral)
cost of failure is high, like a medical device, the SPARK subset of Ada,
with an actual attempt to verify as much as you can of the software, would be
a good choice.  If you are trying to enumerate Pythagorean triples for fun,
that would be a bad choice.  In some projects, what you need to do is to
move quickly from prototype to prototype as you learn more about the problem,
the domain, the risks, the clients, &c.

> What I would like to add is that we are trying to get Erlang accepted in a very C/C++ centric development environment. Although time and time again Erlang has shown itself a superior language and development environment for our domain, there is still a big problem in people accepting it. One of the main stumbling blocks is it is perceived to be "slow". People always say "it won't be faster than a C/C++ solution", or "a language written in a VM will always be slow [rolleyes]".

Well, Erlang *is* slow at arithmetic.  The question is whether that really matters.

Let's put this in perspective.
About 12 years ago I was an expert witness in a court case.
Amongst other things, the contractor claimed that the reason the
program was too slow was that the client's PC was too feeble.
If memory serves me, the PC was one that the contractor had
recommended when the project began.  I pointed out that the
whole of the banking in this country had been done 20 years
earlier on a mainframe with far less memory, far slower, and
with somewhat slower discs, so the machine really ought to be
able to handle one shop!

Today's machines are awesome.  My laptop can sort 250 million
64-bit floats in under a minute (albeit not using qsort()).
I was about to try a bigger benchmark when I realised that was 2 GB.
I used to be able to get times like that with a few thousand numbers.

We now see people _happily_ developing entire applications in
(sorry; I don't like to be offensive, but it's true) PHP.
We see *major* natural language processing frameworks like GATE
written in Java, and project management tools like Colibri.

People will tell you that Java can be, or is about to be, or even
is, as fast as C.  Every benchmark I've ever tried, some of them
quite recently, says NO.  But it's fast _enough_.

> 
> We can argue until we are blue in the face that it will be as good or better when you throw multi-cores in the mix.

I'm sorry, but I don't believe that Erlang will *ever* run faster than
C.  What we need to remember is that *most* large projects fail.
The question we should be looking at is not
	"will a [multicore] Erlang program be faster"
	"will a [multicore] C program be faster"
but
	"will a [multicore] Erlang program be DELIVERABLE"
	"will a [multicore] C program be DELIVERABLE"

An Erlang program that *exists* is infinitely faster than a C program
that had to be abandoned.  A slow Erlang program that customers can
actually *use* is going to earn you more money (that you can spend on
rewriting parts of it in C if you really want to) than a C program
that isn't ready for release outside the Dilbert zone.

You know, a lot of people think that their program is efficient
*because it is written in C*.  But there is a lot of inefficient
code written in C.  I'm sure we've all seen things like

	for (i = 0; i < strlen(s); i++)
	    if (s[i] == ' ') s[i] = '_';
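
To spell out what is wrong with that loop: strlen(s) walks the whole string
each time the loop condition is tested, so the loop as written does work
proportional to the square of the string length.  A compiler cannot be
relied on to hoist the call, because the loop body writes into s.  The
sketch below puts the loop beside a linear version; the function names
underscore_slow and underscore_fast are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The loop as quoted: strlen() is re-evaluated on every iteration. */
static void underscore_slow(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == ' ') s[i] = '_';
}

/* Same result in a single pass: stop at the terminating NUL. */
static void underscore_fast(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (*s == ' ') *s = '_';
}
```

Both functions turn "a b c" into "a_b_c".  Only the second stays
linear as the string grows.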

It might well be that the best way to overcome the "speed" bias
would be to look at some recently written code and show how fast
it *isn't*.

A colleague here once challenged me to improve a back-propagation
program he had written in C.  Thinking like an APL programmer I
speeded it up by a factor of 2.5.  That code was actually a
half-way house to using the BLAS, but since I'd made my point,
I stopped there.

> for many, the "perceived" slowness is one factor that prevents them developing in Erlang.

Just like the way the "perceived" slowness of garbage collection
stopped people using Lisp, until along came Java, and suddenly
GC was respectable.  Yet despite that, and despite experiments
finding faster development in Lisp *and* better performance in the
result, people still stayed away from Lisp.

Wearing my complete cynic's hat here, I suspect that if someone developed
an alternative syntax for Erlang that *looked* like C, it might be more
attractive to the hard-of-thinking.