[erlang-questions] linked in driver port performance

Thu Jul 15 06:01:49 CEST 2010

Yes, that is correct, but the Erlang VM should be running one scheduler
thread per core.  So, a scheduler thread is the main Erlang thread of
execution, but there should be more than one of them on a multicore
machine running in smp mode (where the VM detects, or is told, how many
cores exist).  BIFs like erlang:now() (which hits gettimeofday() in C
land) should have the same constraints as a NIF, so you may already be
blocking a scheduler thread with the tasks you are doing, without
realizing it.

The async thread pool is separate and is shared by all port drivers that
use asynchronous threads, so they compete for all the threads (the count
is specified with "+A number").  The asynchronous threads do not have a
shared job queue, which is more efficient, but makes the pool easy to
clog up if you have long running jobs.  The only port driver that uses
the async thread pool by default (the behavior can be changed with
obscure environment variables) is the one used by the file module.  I
think the crypto application had code for async port driver usage when
using openssl, but as far as I remember it wasn't being used the last
time I checked.

If you are seeing poor performance with the async thread pool, you need
to increase the count of async threads, keeping in mind the job queue is
not shared.  That means any long running async thread pool task will
delay all later jobs queued behind it in a non-deterministic way, if the
async thread pool is too small.  That tempts a person to give "+A 128"
just to make sure no port driver is clogging an async thread, despite
the machine only having 4 cores, which already have Erlang VM scheduler
threads running on them.
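
To make the clogging concrete, here is a rough model in C.  It is not
VM code, just a sketch under assumptions: every job is submitted at
time 0, each async thread owns a private FIFO queue, and a job is pinned
to the queue "key % pool_size".  The function names, the key-modulo rule
and MAX_THREADS are made up for illustration.  The second function
models a shared queue only for comparison.

```c
#include <stddef.h>

#define MAX_THREADS 128   /* hypothetical cap, mirrors the "+A 128" case */

/* Private queues: job i is pinned to thread key[i] % pool_size and must
 * wait for everything already queued on that thread.  finish[i] receives
 * the completion time of job i.  Returns 0, or -1 on a bad pool size. */
int simulate_private_queues(const unsigned *key, const unsigned *cost,
                            size_t njobs, unsigned pool_size,
                            unsigned *finish)
{
    unsigned busy_until[MAX_THREADS] = {0};
    size_t i;

    if (pool_size == 0 || pool_size > MAX_THREADS)
        return -1;
    for (i = 0; i < njobs; i++) {
        unsigned t = key[i] % pool_size;
        busy_until[t] += cost[i];      /* queued behind earlier jobs */
        finish[i] = busy_until[t];
    }
    return 0;
}

/* Shared queue, for comparison only: each job goes to whichever thread
 * becomes free first (lowest index wins a tie). */
int simulate_shared_queue(const unsigned *cost, size_t njobs,
                          unsigned pool_size, unsigned *finish)
{
    unsigned busy_until[MAX_THREADS] = {0};
    size_t i;
    unsigned t;

    if (pool_size == 0 || pool_size > MAX_THREADS)
        return -1;
    for (i = 0; i < njobs; i++) {
        unsigned best = 0;
        for (t = 1; t < pool_size; t++)
            if (busy_until[t] < busy_until[best])
                best = t;
        busy_until[best] += cost[i];
        finish[i] = busy_until[best];
    }
    return 0;
}
```

With keys {0, 1, 2, 3}, costs {100, 1, 1, 1} and a pool of 2, the
private queues give finish times {100, 1, 101, 2}: the third job costs 1
but sits behind the long job.  Under the same assumptions a shared queue
gives {100, 1, 2, 3}, and a pool of 4 private queues gives
{100, 1, 1, 1}, which is why a bigger "+A" looks so tempting.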

Based on the NIF documentation, NIFs should be stable at R14B, so we are
getting really close to NIFs being stable.  Since NIFs are much easier
and more natural in Erlang, I would go with making a NIF.  Most usage of
the async thread pool seems to end in complaints on the mailing list.
Most people then seem to move on to making their own thread pool outside
the Erlang VM, hopefully in a port (or perhaps a scary port driver).
The problem then is that the custom thread pool is fighting the Erlang
scheduler threads for control of the CPUs, and it is relying on the
kernel scheduler, which has no clue about what is being executed... just
that it is code (i.e., it does not know what is important based on
message queuing or Erlang VM priority).  The async thread pool seems to
have the same problem as a custom thread pool.

Hopefully you can avoid the idea of a NIF that has a thread pool, and
just rely on the NIF sending messages, if Erlang code needs to consume
data asynchronously.  If you need to produce data asynchronously in C,
where the latency has been tested to be too great to allow a scheduler
thread to block for the work, you might benefit from an Erlang port,
possibly with a thread pool.  You could always put your own thread pool
into a NIF, and that is made easier by the provided functions like
"enif_thread_create".  However, a NIF with extra threads would lack any
barrier between it and the VM, such that any small errors would create
new and exciting crash dumps (if you are lucky).  So, you could create a
NIF with its own thread pool, but you would need to test a lot and
really try to justify the decision to yourself.

- Michael

On 07/14/2010 07:55 PM, Jarrod Roberson wrote:
> On Wed, Jul 14, 2010 at 9:22 PM, Vinubalaji Gopal <vinubalaji@REDACTED>
> wrote:
>   
>> great so is NIF stable enough and the recommended solution for any kind of
>> interfacing with C/C++?
>>
>> I read in the following link that NIF is good  for simple CPU bound
>> operations and linked in port driver is good for IO  and fine grained
>> concurrency. Is that not true anymore?
>>
>>
>>     
> http://www.erlang-factory.com/upload/presentations/215/ErlangFactorySFBay2010-CliffMoon.pdf
>
> My understanding is NIF(s) are for one shot function calls that will block
> the main Erlang thread.
>
> From the docs <http://ftp.sunet.se/pub/lang/erlang/doc/man/erl_nif.html>:
>
> "*Avoid doing lengthy work in NIF calls as that may degrade the
> responsiveness of the VM.
> NIFs are called directly by the same scheduler thread that executed the
> calling Erlang code.
> The calling scheduler will thus be blocked from doing any other work until
> the NIF returns*."
>
>