[erlang-questions] Keeping massive concurrency when interfacing with C

Mon Oct 3 13:42:05 CEST 2011

John, 

When I see a number like 20k processes, my first question is: what is the load per process?

Do I need a full CPU for each, or a fractional CPU, or maybe I need 20k cores at peak?

If the issue is that you have 20k of anything that needs to be scheduled, the total CPU time you need is 20k x (processing time + switching cost per process). Hardware architecture determines a lot about both terms. On Linux you can get away with hundreds of native processes on Intel, but you may not be able to eke out enough processing time per core to do useful work.
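To make that arithmetic concrete, here is a minimal sketch in C. The names cores_needed and switching_share, the microsecond units, and the idea of each task repeating once per period are assumptions added for illustration, not something measured on real hardware.

```c
/* Back-of-the-envelope scheduling cost model (illustrative assumption).
 *
 * n_tasks    number of things that must be scheduled (e.g. 20000)
 * work_us    useful CPU time each task needs per period, in microseconds
 * switch_us  switching cost paid each time a task is scheduled
 * period_us  how often every task has to run once, in microseconds
 */
double cores_needed(double n_tasks, double work_us,
                    double switch_us, double period_us)
{
    /* total CPU time demanded per period, divided by what one core offers */
    return n_tasks * (work_us + switch_us) / period_us;
}

/* Fraction of each core's time that goes to switching rather than work. */
double switching_share(double work_us, double switch_us)
{
    return switch_us / (work_us + switch_us);
}
```

With made-up numbers: 20k tasks that each need 95 us of work plus 5 us of switching once per second come to cores_needed(20000, 95, 5, 1e6) = 2 cores. Ask for the same work once per millisecond and it becomes 2000 cores, which is why the per-process load matters more than the process count.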

If you are running on multicore ARM you can eke out maybe 10 processes per core before the switching cost alone kills your CPU. This is why we are seeing a move towards 128-core and 256-core ARM processors. If you need 20k cores and can afford around $32-64k in hardware, there are a couple of companies that will have products shipping next year. 

Generally my preferred solution to this problem is event-driven C talking over a socket connection to Erlang. Using kqueue or epoll you can easily handle a few thousand socket connections per core on the C side, and Erlang can easily scale out as a command-and-control infrastructure. 

If you are clever, using consistent hashing to manage system memory across your nodes, combined with job scheduling, can make a handful of cores (24ish) perform like a 20k node cluster. 
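For readers who have not met consistent hashing, here is a minimal sketch in C. The node names, the alive[] flags, the function names, and the use of FNV-1a as the hash are all assumptions for illustration. A production ring would normally keep sorted positions and several virtual points per node instead of this linear scan.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a string hash, used to place nodes and keys on the ring. */
uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Return the index of the node that owns key: the live node with the
 * smallest ring position >= hash(key), wrapping around to the live node
 * with the smallest position overall. Returns -1 if no node is alive. */
int ring_owner(const char *const nodes[], const int alive[], size_t n,
               const char *key)
{
    uint32_t k = fnv1a(key);
    int next = -1, first = -1;
    uint32_t next_pos = 0, first_pos = 0;

    for (size_t i = 0; i < n; i++) {
        if (!alive[i])
            continue;
        uint32_t pos = fnv1a(nodes[i]);
        if (first < 0 || pos < first_pos) {
            first = (int)i;         /* lowest position: wrap-around owner */
            first_pos = pos;
        }
        if (pos >= k && (next < 0 || pos < next_pos)) {
            next = (int)i;          /* closest position at or after the key */
            next_pos = pos;
        }
    }
    return next >= 0 ? next : first;
}
```

The useful property is that taking one node out only moves the keys that node owned; every other job or memory region stays where it was, so the cluster does not reshuffle everything on each change.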

But the specifics of your project and budget will determine if it is even possible :)

Dave

-=-=- dave@REDACTED -=-=-

On Oct 2, 2011, at 5:10 PM, John Smith <emailregaccount@REDACTED> wrote:

> Hi everyone,
> 
> From my understanding, there are four main ways to interface Erlang
> with C:
> 
> * C Node
> * Port
> * Linked-In Driver
> * NIF (Native Implemented Function)
> 
> My problem is if I have, for example, spawned 20,000 Erlang processes
> and I want them all to execute concurrently but they need to call C
> code, how can I have that C code run concurrently without having to
> spawn 20,000 threads in C (which would probably crash the OS) or using
> obscene amounts of memory?
> 
> I've been reading over the examples of a C node, and it seems if
> 20,000 processes all send a message to the C node, the node will
> process them one-by-one and not concurrently, so it becomes a
> serialized bottleneck. Spawning 20,000 C nodes on a single machine
> isn't feasible, because of the amount of memory that would require.
> 
> A port suffers from the same problem, since the Erlang processes would
> be communicating with a single external program, and again, I can't
> create 20,000 instances of that program.
> 
> Reading the documentation for a linked-in driver, it says:
> http://www.erlang.org/doc/tutorial/c_portdriver.html
> 
> "Just as with a port program, the port communicates with a Erlang
> process. All communication goes through one Erlang process that is the
> connected process of the port driver. Terminating this process closes
> the port driver."
> 
> But on the driver documentation page:
> http://www.erlang.org/doc/man/erl_driver.html
> 
> "A driver is a library with a set of function that the emulator calls,
> in response to Erlang functions and message sending. There may be
> multiple instances of a driver, each instance is connected to an
> Erlang port. Every port has a port owner process. Communication with
> the port is normally done through the port owner process."
> 
> So this also seems to have the same problem as C nodes and ports,
> since in order to maintain concurrency I would need 20,000 instances
> of the same driver.
> 
> Finally, we have NIFs. These have potential, but when I read the
> documentation:
> http://www.erlang.org/doc/man/erl_nif.html
> 
> "Avoid doing lengthy work in NIF calls as that may degrade the
> responsiveness of the VM. NIFs are called directly by the same
> scheduler thread that executed the calling Erlang code. The calling
> scheduler will thus be blocked from doing any other work until the NIF
> returns."
> 
> So if one Erlang process calls a NIF, does this mean the other 19,999
> processes are blocked until the NIF returns (or the subset of
> processes a scheduler manages)? If so, this won't work either.
> 
> Does anyone have a solution to this that still allows you to use C
> (I'm using C for the parts that are intensive number crunching)? Or
> will I have to implement everything in Erlang?
> 
> Thanks!