[erlang-questions] Keeping massive concurrency when interfacing with C

Mon Oct 3 08:32:01 CEST 2011

Hi John,

There are a few things I would like to add.

As Kresten said, the number of threads that can truly run in parallel 
on a machine depends on its number of cores. Nevertheless, running your 
code in threads may still bring an advantage in certain cases.
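
As a rough illustration of that limit (a sketch, not something from this
thread): the hypothetical helper pick_worker_count below caps the number
of OS threads at the number of online cores. It assumes
sysconf(_SC_NPROCESSORS_ONLN) is available, which holds on Linux/glibc
and macOS but is not guaranteed by POSIX.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: how many OS threads to start for `tasks` jobs.
 * Assumes sysconf(_SC_NPROCESSORS_ONLN) exists on this platform;
 * falls back to 1 if the core count cannot be determined. */
size_t pick_worker_count(size_t tasks)
{
    long cores = sysconf(_SC_NPROCESSORS_ONLN);
    if (cores < 1)
        cores = 1;

    /* More threads than cores cannot run in parallel, and more
     * threads than tasks would just sit idle. */
    size_t workers = (size_t)cores;
    if (tasks < workers)
        workers = tasks;

    assert(workers <= tasks);
    return workers;
}
```

With 20k independent jobs, this returns the core count, and each worker
would then pull jobs from a shared queue rather than owning one thread
per job.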

What I could not tell from your e-mails is how you would like the 
information from Erlang to be processed. That is, is the information 
processed by one Erlang thread linked to that of another running thread, 
or does each thread have its own distinct information? These questions 
need to be answered before you go further in designing your application.

If the information is linked, then serializing the data does not seem 
like such a bad idea (depending on how closely the threads' data are 
related). Otherwise, if the data are independent per thread, you don't 
need to worry about creating 20k threads in C (using NIFs); just create 
a dynamic library (.so, shared object) and load it before starting your 
Erlang threads. Linux will take care of the rest (it will create as many 
data instances within your library as required). You just need to make 
sure that your library is thread safe (mainly, that it does not leak 
memory and does not try to use more buffer memory than you physically 
have).
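
One way to keep the data independent per thread can be sketched as
follows. This is a minimal sketch under my own assumptions, not a NIF:
crunch_ctx, crunch and crunch_parallel are hypothetical names, and the
sum of squares merely stands in for the real number crunching. The point
is that each caller owns its state. Global variables in a shared object
are shared by all threads of the process that loads it unless they are
declared thread-local, so the sketch avoids globals altogether.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-caller state: each thread owns exactly one of
 * these, so the library itself keeps no shared mutable data. */
typedef struct {
    uint64_t n;       /* input: number of terms */
    uint64_t result;  /* output: sum of squares 1..n */
} crunch_ctx;

/* Reentrant: reads and writes only the context passed in. */
void crunch(crunch_ctx *ctx)
{
    assert(ctx != NULL);
    uint64_t acc = 0;
    for (uint64_t i = 1; i <= ctx->n; i++)
        acc += i * i;
    ctx->result = acc;
}

/* Thread entry point: forwards its private context to crunch(). */
static void *worker(void *arg)
{
    crunch((crunch_ctx *)arg);
    return NULL;
}

/* Runs one thread per context (for demonstration only) and waits
 * for all of them. Returns 0 on success, -1 on any failure. */
int crunch_parallel(crunch_ctx *ctxs, size_t count)
{
    if (count == 0)
        return 0;

    pthread_t *tids = malloc(count * sizeof *tids);
    if (tids == NULL)
        return -1;

    int rc = 0;
    size_t started = 0;
    for (; started < count; started++) {
        if (pthread_create(&tids[started], NULL, worker,
                           &ctxs[started]) != 0) {
            rc = -1;
            break;
        }
    }

    /* Join only the threads that were actually started. */
    for (size_t i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) != 0)
            rc = -1;
    }

    free(tids);
    return rc;
}
```

Because no two threads touch the same crunch_ctx, no locking is needed.
Linked data would need either a lock or the serialization mentioned
above.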

If you are wondering whether to use Erlang or plain C (or any other 
programming language, for that matter), think first about what you need 
in the end. All of us would like super-fast applications that squeeze 
the maximum computational power out of our hardware, but we all need to 
make some compromises. My impression from using Erlang is that it is not 
suited to regular desktop applications; instead, it is a very handy tool 
for developing applications such as non-blocking complex data processing 
and fast network applications (and maybe more, but so far I have used 
Erlang only for those). It's not that you cannot achieve all of that by 
writing your applications in C, but why reinvent the wheel when you can 
just have it? Erlang is robust enough to give you a nice environment for 
these kinds of applications.

In conclusion, using Erlang is largely a matter of taste and of how 
comfortable you feel with such a programming language. Searching for 
benchmarks of a programming language doesn't help much, because they are 
usually made under specific conditions which, in 90% of cases, do not 
fit your needs. Since you need high concurrency, I suggest you favor 
more cores at a lower frequency over fewer cores at a higher frequency 
(or, if you can afford it, a GPU instead of a CPU). Keep in mind that 
whatever you choose, you will always be restricted by your hardware, and 
for the few milliseconds you may gain per process you may need to work 
hours, if not days.

Good luck!

Cheers,
CGS

On 10/03/2011 03:44 AM, John Smith wrote:
> Thanks for the reply, Kresten!
>
> I definitely would not be doing any disk I/O in the C code. It would
> be intense number crunching, so it would be CPU (and perhaps memory)
> bound. Everything I've read states Erlang is not good at number
> crunching (Cesarini mentions this in his "Erlang Programming" book) so
> I'm considering writing the code to do that in C.
>
> If I call a NIF, only the particular scheduler that manages that
> Erlang process would be blocked and no other scheduler, right? So for
> example, if I have a CPU with eight cores, and an Erlang scheduler
> thread is running on each core, and say the third scheduler is
> executing an Erlang process that calls a NIF (and so blocks), only
> that scheduler would be blocked until the NIF finishes executing,
> correct?
>
> I'm debating which solution would be better. Erlang would be slower at
> number crunching, but is extremely efficient at managing concurrent
> executing processes, meaning each would gradually make progress every
> X units of time since they'll all get a turn to execute. But I wonder
> if having a single process execute NIF code until it finishes (and so
> all the processes managed by a single scheduler execute serially)
> would be faster than implementing it all in Erlang and having
> processes execute concurrently within a single scheduler (albeit the
> code would be slower to execute). There would be less overhead of
> Erlang process context switching (although admittedly that isn't much
> to begin with) and the C code would be faster at number crunching. I
> suppose there's only one way to find out! :)
>
> I was also thinking about writing the number crunching code in some
> other language than C, such as OCaml. OCaml has a reputation for being
> as fast as C, yet not nearly as low-level. Maybe that would be a good
> fit with Erlang.
>
> Example benchmarks:
> http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=hipe&lang2=gpp
>
> http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=ocaml
>
> The Erlang benchmark was using HiPE as well.
>
> Thanks for the suggestion!