[erlang-questions] NIF appropriateness, was: Re: Messing with heart. Port and NIF, which one is better?

Thu Feb 14 20:28:36 CET 2013

On 02/14/2013 09:21 AM, Garrett Smith wrote:
> On Thu, Feb 14, 2013 at 4:30 AM, Scott Lystig Fritchie
> <fritchie@REDACTED> wrote:
>> I'm starting a new'ish thread to mention a bit of experience that Basho
>> has had with NIFs in Riak.
>>
>> Garrett Smith <g@REDACTED> wrote:
>>
>>>> And the second question, Is there any good argument to use NIF
>>>> instead of creating a connected process for a port.
>> gs> The NIF interface is appropriate for defining simple functions in C.
>> gs> There are lots of 3rd party libraries where NIFs are used to plug
>> gs> in long-running, multi-threaded facilities, but this seems misguided
>> gs> to me.
>>
>> "Simple functions in C" is a tricky matter ... and it has gotten trickier
>> with the Erlang/OTP releases R15 and R16.
>>
>> In R14 and earlier, it wasn't necessarily a horrible thing if you had C
>> code (or C++ or Fortran or ...) that executed in NIF context for half a
>> second or more.  If your NIF was executing for that long, you knew that
>> you were interfering with the Erlang scheduler Pthread that was
>> executing your NIF's C/C++/Fortran/whatever code.  That can cause some
>> weird delays in executing other Erlang processes, but for some apps,
>> that's OK.
>>
>> However, with R15, the internal guts of the Erlang process scheduler
>> Pthreads have changed.  Now, if you have a NIF that executes for even a
>> few milliseconds, the scheduler algorithm can get confused.  Instead of
>> blocking an Erlang scheduler Pthread, you both block that Pthread *and*
>> you might cause some other scheduler Pthreads to decide incorrectly to
>> go to sleep (because there aren't enough runnable Erlang processes to
>> bother trying to schedule).  Your 8/16/24 CPU core box can find itself
>> down to only 3 or 2 active Erlang scheduler Pthreads when there really
>> are more than 2-3 cores of work waiting.
>>
>> So, suddenly your "simple functions in C" are now "simple functions in C
>> that must finish execution in about 1 millisecond or less".  If your C
>> code might take longer than that, then you must use some kind of thread
>> pool to transfer the long-running work away from the Erlang scheduler
>> Pthread.  Not simple at all, alas.
> Thanks for highlighting this Scott.
>
> Sean Cribbs went into some of these details last night at the Chicago
> Riak meetup.
>
> I imagine this has serious implications for the 0MQ bindings, which
> are NIF implemented. I'm currently running everything under R14, so am
> apparently insulated, but this overall sounds quite bad.
>
> Have you seen this behavior in port drivers?
>
> Garrett
>
The erlzmq2 NIF uses a separate thread for receiving, and enif_send is used to deliver the incoming data, with locks in between.  Since the blocking work happens on that separate thread and the NIF functions themselves do not block in the C code, I don't see why the impact would be serious.
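
To make the pattern concrete, here is a minimal sketch in plain C with pthreads of handing work to a separate thread so that the submitting call returns right away.  This is not the erlzmq2 source: every name below (worker_t, worker_submit, slow_work, sum_results and so on) is hypothetical, and the deliver callback only stands in for the point where a real NIF would pass the result back to an Erlang process with enif_send.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* All names here are hypothetical; this is an illustration, not erlzmq2. */

typedef struct job {
    int input;
    struct job *next;
} job_t;

/* Stand-in for the step where a real NIF would call enif_send. */
typedef void (*deliver_fn)(int result, void *ctx);

typedef struct {
    pthread_mutex_t lock;   /* protects head, tail and stopping */
    pthread_cond_t  ready;  /* signalled on new job or on stop  */
    job_t *head, *tail;     /* FIFO queue of pending jobs       */
    int stopping;
    deliver_fn deliver;
    void *ctx;
    pthread_t thread;
} worker_t;

/* Placeholder for a call that may block or run for a long time. */
static int slow_work(int input) { return input * input; }

static void *worker_main(void *arg) {
    worker_t *w = arg;
    for (;;) {
        pthread_mutex_lock(&w->lock);
        while (w->head == NULL && !w->stopping)
            pthread_cond_wait(&w->ready, &w->lock);
        job_t *job = w->head;
        if (job == NULL) {              /* stopping and queue drained */
            pthread_mutex_unlock(&w->lock);
            return NULL;
        }
        w->head = job->next;
        if (w->head == NULL) w->tail = NULL;
        pthread_mutex_unlock(&w->lock);

        /* The slow part runs outside the lock, on this thread only. */
        int result = slow_work(job->input);
        free(job);
        w->deliver(result, w->ctx);
    }
}

int worker_start(worker_t *w, deliver_fn deliver, void *ctx) {
    assert(w != NULL && deliver != NULL);
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->ready, NULL);
    w->head = w->tail = NULL;
    w->stopping = 0;
    w->deliver = deliver;
    w->ctx = ctx;
    return pthread_create(&w->thread, NULL, worker_main, w);
}

/* What the NIF-side call would do: enqueue and return immediately. */
int worker_submit(worker_t *w, int input) {
    job_t *job = malloc(sizeof *job);
    if (job == NULL) return -1;
    job->input = input;
    job->next = NULL;
    pthread_mutex_lock(&w->lock);
    if (w->tail) w->tail->next = job; else w->head = job;
    w->tail = job;
    pthread_cond_signal(&w->ready);
    pthread_mutex_unlock(&w->lock);
    return 0;
}

/* Lets the worker finish any queued jobs, then joins it. */
void worker_stop(worker_t *w) {
    pthread_mutex_lock(&w->lock);
    w->stopping = 1;
    pthread_cond_signal(&w->ready);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    pthread_mutex_destroy(&w->lock);
    pthread_cond_destroy(&w->ready);
}

/* Example deliver callback: adds each result to the int behind ctx. */
void sum_results(int result, void *ctx) { *(int *)ctx += result; }
```

The only work done by the caller of worker_submit is a malloc and a short critical section, which is the property that matters for a scheduler thread.  Calling worker_start with sum_results, submitting a few inputs, and then calling worker_stop leaves the sum of the results in the caller's int, since worker_stop joins the thread only after the queue has been drained.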