[erlang-questions] Keeping massive concurrency when interfacing with C

Sun Oct 2 23:10:31 CEST 2011

Hi everyone,

From my understanding, there are four main ways to interface Erlang
with C:

* C Node
* Port
* Linked-In Driver
* NIF (Native Implemented Function)

My problem is this: suppose I have spawned 20,000 Erlang processes
that should all execute concurrently, and each of them needs to call C
code. How can I have that C code run concurrently without spawning
20,000 threads in C (which would probably crash the OS) or using
obscene amounts of memory?
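
To put a rough number on the thread concern, here is a minimal sketch
in plain C of the address space that thread stacks alone would reserve.
The function name is hypothetical, and the 8 MiB stack size used below
is an assumption (a common Linux default, not a universal one).

```c
#include <assert.h>
#include <stdint.h>

/* Address space reserved for thread stacks alone, in bytes.
 * stack_bytes is an assumption: 8 MiB is a common Linux default
 * (see `ulimit -s`), but it varies by system and can be lowered
 * with pthread_attr_setstacksize(). */
uint64_t stack_reservation_bytes(uint64_t threads, uint64_t stack_bytes)
{
    /* Guard against overflow in the multiplication. */
    assert(stack_bytes == 0 || threads <= UINT64_MAX / stack_bytes);
    return threads * stack_bytes;
}
```

Under those assumptions, stack_reservation_bytes(20000, 8 * 1024 * 1024)
gives 167,772,160,000 bytes, roughly 156 GiB. This is reserved virtual
address space rather than resident memory, so the real cost depends on
how much of each stack is actually touched.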

I've been reading over the examples of a C node, and it seems if
20,000 processes all send a message to the C node, the node will
process them one-by-one and not concurrently, so it becomes a
serialized bottleneck. Spawning 20,000 C nodes on a single machine
isn't feasible, because of the amount of memory that would require.
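
The bottleneck I have in mind can be sketched with a toy cost model in
plain C. It does not use the real C node API: the struct, the function
name, and the idea that every message costs the same fixed time are all
assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical workload: each Erlang process sends one message, and
 * the C side needs the same fixed time to handle each of them. */
typedef struct {
    long requests;  /* number of messages queued at the C node */
    long cost_us;   /* assumed handling time per message, microseconds */
} workload;

/* In a single-threaded receive loop, request i (0-based) cannot start
 * until the i requests ahead of it are done, so it finishes at
 * (i + 1) * cost_us after the loop starts. */
long serial_finish_us(const workload *w, long i)
{
    assert(i >= 0 && i < w->requests);
    return (i + 1) * w->cost_us;
}
```

With 20,000 requests at an assumed 1,000 microseconds each, the first
reply arrives after 1 ms but the last one only after 20,000,000
microseconds (20 seconds), however many Erlang processes are waiting.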

A port suffers from the same problem, since the Erlang processes would
be communicating with a single external program, and again, I can't
create 20,000 instances of that program.

Reading the documentation for a linked-in driver, it says:
http://www.erlang.org/doc/tutorial/c_portdriver.html

"Just as with a port program, the port communicates with a Erlang
process. All communication goes through one Erlang process that is the
connected process of the port driver. Terminating this process closes
the port driver."

But on the driver documentation page:
http://www.erlang.org/doc/man/erl_driver.html

"A driver is a library with a set of function that the emulator calls,
in response to Erlang functions and message sending. There may be
multiple instances of a driver, each instance is connected to an
Erlang port. Every port has a port owner process. Communication with
the port is normally done through the port owner process."

So this also seems to have the same problem as C nodes and ports,
since in order to maintain concurrency I would need 20,000 instances
of the same driver.

Finally, we have NIFs. These have potential, but when I read the
documentation:
http://www.erlang.org/doc/man/erl_nif.html

"Avoid doing lengthy work in NIF calls as that may degrade the
responsiveness of the VM. NIFs are called directly by the same
scheduler thread that executed the calling Erlang code. The calling
scheduler will thus be blocked from doing any other work until the NIF
returns."

So if one Erlang process calls a NIF, does this mean the other 19,999
processes are blocked until the NIF returns, or at least the subset of
processes that the calling scheduler manages? If so, this won't work
either.

Does anyone have a solution to this that still allows you to use C
(I'm using C for the parts that are intensive number crunching)? Or
will I have to implement everything in Erlang?

Thanks!