[erlang-questions] Millions of processes?

Wed Sep 24 16:28:48 CEST 2008

On Wed, Sep 24, 2008 at 3:32 PM, Valentin Micic <v@REDACTED> wrote:
> I do not understand *why* do we even compare ERLANG processes with POSIX
> threads? What is the connection??

It's valuable to spread Erlang mindshare.

What leaks out about Erlang to outsiders is that Erlang is good at
making use of multi-core computers and distributing computations. And
while Erlang has some great examples of applications that scale very
well with the number of cores available, it's not really a
representative picture that you go to Erlang to write your ray-tracer
if you want it to run 16 times faster on a 16-core machine than on a
1-core machine.

The advice represented in the Linux FAQ mentioned above probably comes
from people that are worried about using the optimal number of threads
for their computations on a specific hardware platform to get the most
out of it.

I'd like to draw a distinction between using threads for "modeling"
reasons and using them for "technical" reasons.

In Erlang we use Erlang processes for modeling reasons: it is simpler
to program if you map each concurrent activity in a system to a
process, so the code for each process has only a single sequential job
to focus on (and to get right).

Technical reasons to use threads or processes are those that are
unrelated to making the modeling simpler; often they make the model
more complex.  Performing a kernel convolution on a large image is
typical of something that is very simple to express as a single
sequential job. It's basically four nested loops: for each pixel in
the image, you sum the products of the kernel coefficients with the
pixels in a region around the current pixel. This means each pixel in
the result image depends only on the kernel and an NxN region around
that pixel in the input image. There is potential for huge gains in
speed in having 8 cores perform the convolution concurrently, each on
1/8th of the pixels in the image. But you need to do it right: you
must make sure each core makes the most of the data it gets into its
cache lines, otherwise you risk the data bus becoming the bottleneck
and you won't get an 8-times speedup.
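
To make the "four nested loops" concrete, here is a minimal sketch of
the convolution described above, written in Python rather than Erlang
purely for illustration. The function names (convolve_rows, split_rows,
convolve), the row-band split and the choice to treat pixels outside
the image as 0 are all my own assumptions, not something from the
thread. It only shows that each band of rows can be computed
independently of the others; the bands are still computed one after
another here, and it does nothing about the cache behaviour that
decides whether you actually get the speedup.

```python
def convolve_rows(image, kernel, row_start, row_end):
    """Convolve rows [row_start, row_end) of image with an n x n kernel.

    Assumptions: n is odd, and pixels outside the image count as 0.
    The kernel is applied as-is (not flipped), matching the
    "sum of products over a region" description.
    """
    height, width = len(image), len(image[0])
    n = len(kernel)
    half = n // 2
    out = []
    for y in range(row_start, row_end):      # loop 1: output rows
        out_row = []
        for x in range(width):               # loop 2: output columns
            acc = 0
            for ky in range(n):              # loop 3: kernel rows
                for kx in range(n):          # loop 4: kernel columns
                    sy, sx = y + ky - half, x + kx - half
                    if 0 <= sy < height and 0 <= sx < width:
                        acc += kernel[ky][kx] * image[sy][sx]
            out_row.append(acc)
        out.append(out_row)
    return out


def split_rows(height, parts):
    """Return (start, end) row ranges covering 0..height in contiguous bands."""
    bounds = [height * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def convolve(image, kernel, parts=1):
    """Convolve the whole image band by band.

    Each band reads only the input image and the kernel, so the bands
    could be handed to separate workers; here they run sequentially.
    """
    result = []
    for start, end in split_rows(len(image), parts):
        result.extend(convolve_rows(image, kernel, start, end))
    return result
```

Calling convolve(image, kernel, parts=8) gives the same result as
parts=1, which is the property that makes the 8-core split possible in
the first place. Splitting by rows is just one choice; how the bands
map onto cache lines is where the real work is.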

A lot of people coming to Erlang wonder where the libraries are for
splitting up a job into parts executed in multiple processes, so they
can make this "technical" use of multi-core computers.
However, most use of Erlang is in domains where you have embarrassingly
simple parallelism.  A web application can easily have 100 concurrent
requests to process, and when each is modelled as an isolated process,
you can already potentially make use of a 96-core machine, which,
right now, most of us can only wish for in a year or two.