[erlang-questions] NIF appropriateness, was: Re: Messing with heart. Port and NIF, which one is better?

Thu Feb 14 11:52:17 CET 2013

On 02/14/2013 02:30 AM, Scott Lystig Fritchie wrote:
> I'm starting a new'ish thread to mention a bit of experience that Basho
> has had with NIFs in Riak.
>
> Garrett Smith <g@REDACTED> wrote:
>
>>> And the second question, Is there any good argument to use NIF
>>> instead of creating a connected process for a port.
> gs> The NIF interface is appropriate for defining simple functions in C.
> gs> There are lots of 3rd party libraries where NIFs are used to plug
> gs> in long-running, multi-threaded facilities, but this seems misguided
> gs> to me.
>
> "Simple functions in C" is a tricky matter ... and it has gotten trickier
> with the Erlang/OTP releases R15 and R16.
>
> In R14 and earlier, it wasn't necessarily a horrible thing if you had C
> code (or C++ or Fortran or ...) that executed in NIF context for half a
> second or more.  If your NIF was executing for that long, you knew that
> you were interfering with the Erlang scheduler Pthread that was
> executing your NIF's C/C++/Fortran/whatever code.  That can cause some
> weird delays in executing other Erlang processes, but for some apps,
> that's OK.
>
> However, with R15, the internal guts of the Erlang process scheduler
> Pthreads have changed.  Now, if you have a NIF that executes for even a
> few milliseconds, the scheduler algorithm can get confused.  Instead of
> blocking an Erlang scheduler Pthread, you both block that Pthread *and*
> you might cause some other scheduler Pthreads to decide incorrectly to
> go to sleep (because there aren't enough runnable Erlang processes to
> bother trying to schedule).  Your 8/16/24 CPU core box can find itself
> down to only 2 or 3 active Erlang scheduler Pthreads when there really
> are more than 2-3 cores' worth of work waiting.
>
> So, suddenly your "simple functions in C" are now "simple functions in C
> that must finish execution in about 1 millisecond or less".  If your C
> code might take longer than that, then you must use some kind of thread
> pool to transfer the long-running work away from the Erlang scheduler
> Pthread.  Not simple at all, alas.
>
> -Scott

These are the problems that NIF native processes will solve, right?  The only other alternative is the async thread pool within a port driver, which may not help the schedulers and is obsoleted by native processes.  The async pool also keeps one job queue per thread, so a long job blocks every job queued behind it on that thread.
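
To make the handoff concrete, here is a minimal sketch in plain POSIX threads, not actual NIF or driver code. All names in it (worker_t, job_t, worker_submit, slow_square and so on) are hypothetical. The caller enqueues a job and returns at once, which is what a scheduler thread needs; a single worker drains a FIFO queue, which also shows why one queue per thread lets a long job hold up everything behind it. In a real NIF the worker would presumably be created with enif_thread_create and would report back with enif_send rather than have anyone wait on it; in a port driver, driver_async plays the submit role.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* All names below are hypothetical; none come from erl_nif or erl_driver. */
typedef int (*job_fn)(int arg);

typedef struct job {
    job_fn fn;
    int arg;
    int result;
    int done;
    struct job *next;
} job_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;   /* signalled on enqueue, completion, shutdown */
    job_t *head, *tail;       /* one FIFO queue for this one worker thread */
    int stopping;
    pthread_t thread;
} worker_t;

/* Worker loop: pop jobs in FIFO order and run them outside the lock. */
static void *worker_main(void *p) {
    worker_t *w = p;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (w->head == NULL && !w->stopping)
            pthread_cond_wait(&w->changed, &w->lock);
        if (w->head == NULL)
            break;                       /* stopping and queue drained */
        job_t *j = w->head;
        w->head = j->next;
        if (w->head == NULL)
            w->tail = NULL;
        pthread_mutex_unlock(&w->lock);
        int r = j->fn(j->arg);           /* the slow part runs here */
        pthread_mutex_lock(&w->lock);
        j->result = r;
        j->done = 1;
        pthread_cond_broadcast(&w->changed);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static int worker_start(worker_t *w) {
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->changed, NULL);
    w->head = w->tail = NULL;
    w->stopping = 0;
    return pthread_create(&w->thread, NULL, worker_main, w);
}

/* Enqueue and return immediately; the caller never runs fn itself. */
static void worker_submit(worker_t *w, job_t *j, job_fn fn, int arg) {
    assert(w != NULL && j != NULL && fn != NULL);
    j->fn = fn; j->arg = arg; j->result = 0; j->done = 0; j->next = NULL;
    pthread_mutex_lock(&w->lock);
    if (w->tail) w->tail->next = j; else w->head = j;
    w->tail = j;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
}

/* Block until job j has finished (only for the demo; a NIF would not wait). */
static int worker_wait(worker_t *w, job_t *j) {
    pthread_mutex_lock(&w->lock);
    while (!j->done)
        pthread_cond_wait(&w->changed, &w->lock);
    int r = j->result;
    pthread_mutex_unlock(&w->lock);
    return r;
}

/* Finish queued jobs, then stop and join the worker. */
static void worker_stop(worker_t *w) {
    pthread_mutex_lock(&w->lock);
    w->stopping = 1;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    pthread_cond_destroy(&w->changed);
    pthread_mutex_destroy(&w->lock);
}

/* Stand-in for long-running native code. */
static int slow_square(int x) {
    volatile unsigned long spin = 0;
    for (unsigned long i = 0; i < 5000000UL; i++)
        spin += i;
    return x * x;
}
```

Submitting slow_square twice and waiting on the second job only returns after the first has finished too: that is the head-of-line blocking I mean with a job queue per thread.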