[erlang-questions] NIF appropriateness, was: Re: Messing with heart. Port and NIF, which one is better?
Thu Feb 14 18:21:05 CET 2013
On Thu, Feb 14, 2013 at 4:30 AM, Scott Lystig Fritchie wrote:
> I'm starting a new'ish thread to mention a bit of experience that Basho
> has had with NIFs in Riak.
> Garrett Smith wrote:
>>> And the second question, Is there any good argument to use NIF
>>> instead of creating a connected process for a port.
> gs> The NIF interface is appropriate for defining simple functions in C.
> gs> There are lots of 3rd party libraries where NIFs are used to plug
> gs> in long-running, multi-threaded facilities, but this seems misguided
> gs> to me.
> "Simple functions in C" is a tricky matter ... and it has gotten trickier
> with the Erlang/OTP releases R15 and R16.
> In R14 and earlier, it wasn't necessarily a horrible thing if you had C
> code (or C++ or Fortran or ...) that executed in NIF context for half a
> second or more. If your NIF was executing for that long, you knew that
> you were interfering with the Erlang scheduler Pthread that was
> executing your NIF's C/C++/Fortran/whatever code. That can cause some
> weird delays in executing other Erlang processes, but for some apps,
> that's OK.
> However, with R15, the internal guts of the Erlang process scheduler
> Pthreads have changed. Now, if you have a NIF that executes for even a
> few milliseconds, the scheduler algorithm can get confused. Instead of
> blocking an Erlang scheduler Pthread, you both block that Pthread *and*
> you might cause some other scheduler Pthreads to decide incorrectly to
> go to sleep (because there aren't enough runnable Erlang processes to
> bother trying to schedule). Your 8/16/24 CPU core box can find itself
> down to only 3 or 2 active Erlang scheduler Pthreads when there really
> are more than 2-3 cores of work waiting.
> So, suddenly your "simple functions in C" are now "simple functions in C
> that must finish execution in about 1 millisecond or less". If your C
> code might take longer than that, then you must use some kind of thread
> pool to transfer the long-running work away from the Erlang scheduler
> Pthread. Not simple at all, alas.
Thanks for highlighting this, Scott.
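
For anyone following along, here is a rough sketch of the kind of
thread-pool handoff described above, in plain C with POSIX threads.
It is not taken from Riak or from any existing NIF library, and every
name in it (job_t, pool_t, pool_start, pool_submit, pool_wait,
pool_stop, slow_square) is made up for illustration. It also uses a
single worker rather than a real pool. The point is only that the
calling thread (standing in for the Erlang scheduler Pthread) queues
the job and returns at once, while the slow C code runs on another
thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical job record: the slow function, its input, and a result slot. */
typedef struct job {
    long (*fn)(long);
    long arg;
    long result;
    int done;               /* set by the worker once result is valid */
    struct job *next;
} job_t;

/* Hypothetical one-worker "pool": a FIFO queue guarded by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t work_ready;  /* signalled on enqueue and on shutdown */
    pthread_cond_t job_done;    /* broadcast whenever a job completes */
    job_t *head, *tail;
    int stopping;
    pthread_t worker;
} pool_t;

/* Stand-in for long-running C code; burns some CPU, then squares x. */
static long slow_square(long x) {
    volatile long spin = 0;
    for (long i = 0; i < 5000000; i++) spin += i;
    return x * x;
}

static void *worker_main(void *p) {
    pool_t *pool = p;
    pthread_mutex_lock(&pool->lock);
    for (;;) {
        while (pool->head == NULL && !pool->stopping)
            pthread_cond_wait(&pool->work_ready, &pool->lock);
        if (pool->head == NULL) break;       /* stopping and queue drained */
        job_t *j = pool->head;
        pool->head = j->next;
        if (pool->head == NULL) pool->tail = NULL;
        pthread_mutex_unlock(&pool->lock);
        long r = j->fn(j->arg);              /* slow work, off the caller's thread */
        pthread_mutex_lock(&pool->lock);
        j->result = r;
        j->done = 1;
        pthread_cond_broadcast(&pool->job_done);
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

/* Returns 0 on success (the return value of pthread_create). */
static int pool_start(pool_t *pool) {
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->work_ready, NULL);
    pthread_cond_init(&pool->job_done, NULL);
    pool->head = pool->tail = NULL;
    pool->stopping = 0;
    return pthread_create(&pool->worker, NULL, worker_main, pool);
}

/* Queues the job and returns immediately; the caller owns *j. */
static void pool_submit(pool_t *pool, job_t *j, long (*fn)(long), long arg) {
    assert(fn != NULL);
    j->fn = fn; j->arg = arg; j->result = 0; j->done = 0; j->next = NULL;
    pthread_mutex_lock(&pool->lock);
    if (pool->tail) pool->tail->next = j; else pool->head = j;
    pool->tail = j;
    pthread_cond_signal(&pool->work_ready);
    pthread_mutex_unlock(&pool->lock);
}

/* Blocks until the job has finished, then returns its result. */
static long pool_wait(pool_t *pool, job_t *j) {
    pthread_mutex_lock(&pool->lock);
    while (!j->done) pthread_cond_wait(&pool->job_done, &pool->lock);
    long r = j->result;
    pthread_mutex_unlock(&pool->lock);
    return r;
}

/* Lets the worker drain the queue, then joins it and frees the primitives. */
static void pool_stop(pool_t *pool) {
    pthread_mutex_lock(&pool->lock);
    pool->stopping = 1;
    pthread_cond_signal(&pool->work_ready);
    pthread_mutex_unlock(&pool->lock);
    pthread_join(pool->worker, NULL);
    pthread_cond_destroy(&pool->job_done);
    pthread_cond_destroy(&pool->work_ready);
    pthread_mutex_destroy(&pool->lock);
}
```

A caller would run pool_start once, then pool_submit for each request.
In an actual NIF you would not call anything like pool_wait from the
scheduler thread, since that just blocks it again; the worker would
instead send the result back to the calling Erlang process as a
message. That part is left out here to keep the sketch free of any
Erlang headers.
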
Sean Cribbs went into some of these details last night at the Chicago
I imagine this has serious implications for the 0MQ bindings, which
are NIF implemented. I'm currently running everything under R14, so am
apparently insulated, but this overall sounds quite bad.
Have you seen this behavior in port drivers?