[erlang-questions] driver_entry stop() and driver_async() interaction

Fri Dec 19 15:35:41 CET 2008

> Very interesting. We will certainly have a look at this.
> Does it work on Windows too?

Alas not yet, but I think it should be easy to do.

   - it uses pthreads calls directly (but you now provide an
abstraction, so increasing portability should be fairly
straight-forward, although I won't be doing it myself)

   - it uses driver_select with pipe file descriptors, so it would need a
minor change to use a Windows event handle instead.
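
To illustrate the pipe-based wakeup mentioned in the second point, here is
a minimal POSIX sketch, not the EDTK code itself. A worker thread writes
one byte to a pipe when its job is done, and the main thread waits for the
read end to become readable. In a real driver the read end would be
registered with driver_select and the read would happen in the input-ready
callback; poll() stands in for the emulator's event loop here. The names
notify_fds, worker and run_pipe_notify_demo are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

static int notify_fds[2];   /* [0] = read end, [1] = write end */

/* Worker thread: finish a (pretend) job, then wake the main thread. */
static void *worker(void *arg)
{
    char token = 'x';
    (void)arg;
    /* ... a blocking library call would run here ... */
    if (write(notify_fds[1], &token, 1) != 1)
        return (void *)1;
    return NULL;
}

/* Returns the number of completion bytes read (1 on success), -1 on error. */
int run_pipe_notify_demo(void)
{
    pthread_t tid;
    struct pollfd pfd;
    char token;
    int got = -1;

    if (pipe(notify_fds) != 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    /* Stand-in for the emulator selecting on the read end of the pipe. */
    pfd.fd = notify_fds[0];
    pfd.events = POLLIN;
    if (poll(&pfd, 1, 5000) == 1 && (pfd.revents & POLLIN))
        got = (int)read(notify_fds[0], &token, 1);

    pthread_join(tid, NULL);
    close(notify_fds[0]);
    close(notify_fds[1]);
    return got;
}
```

Calling run_pipe_notify_demo() from a test program should return 1 once
the worker has signalled completion. On Windows the pipe would be replaced
by an event handle, which is the minor change referred to above.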

Basically the EDTK thread-pool code looks a lot like a generalization
of the existing Erlang VM async threads code.  The main difference is
that every thread-pool has exactly one producer-consumer queue (rather
than having one such queue per individual thread, as the existing
'async' pool does).   Plus the EDTK implementation has the other
features that I've already mentioned.   (All of those features have
real-world motivations; e.g. setting the stack size is vital if you
want even a moderate number of threads but cannot afford the default 1
MB+ (virtual-memory) stack space per thread that many OSes give you by
default.)

The main thing that the EDTK implementation does not currently do is
allow dynamic creation of entirely new thread-pools (for a given
driver instance) at runtime.  That's because the choice of which
thread-pool to use to execute which (wrapped) library call is entirely
static in the current EDTK implementation.  I haven't yet seen a
use-case for allowing Erlang code to dynamically choose which
thread-pool to use.  In fact, as the particular pattern of thread-pool
use is often critical to avoiding resource starvation or several
classes of deadlock, I think it would probably be quite dangerous to
expose that level of control to applications (certainly that's true
for the berkeley_db driver).

Chris

On Fri, Dec 19, 2008 at 12:07 AM, Raimo Niskanen
<raimo+erlang-questions@REDACTED> wrote:
> On Thu, Dec 18, 2008 at 01:52:10PM -0800, Chris Newcombe wrote:
>> >> I think it would be _great_ if the VM/driver interface provided a more traditional thread
>> >> pool/queue API in addition to the existing async functionality.
>>
>> Agreed.   Some libraries even need more than one thread pool.
>>
>> If you use the latest version of the Erlang Driver Toolkit
>>
>>    http://www.snookles.com/erlang/edtk/
>>
>> ... it allows the driver to use an arbitrary number of private
>> thread-pools (e.g. the Berkeley DB driver required 5 separate thread
>> pools to avoid potential thread-level deadlock).
>>
>> Also, for full runtime visibility and control, the EDTK thread-pools
>> can be examined and re-sized at runtime (i.e. increase the number of
>> threads independently, per thread pool).   You can also set the stack
>> size for the threads, and set a limit on the length of the
>> command-queue for each thread pool (important for flow control).
>>
>> EDTK now provides these features for all generated drivers -- it is
>> trivial to set up.  See the declarations at top of
>> examples/berkeley_db/berkeley_db.xml for an example.
>>
>> If the Erlang VM does ever provide a better threadpool facility, it
>> would be great if it had these same features and flexibility (you could
>> adapt the code from EDTK, as it is BSD-licensed and has been well
>> tested).   The fact that the BerkeleyDB driver needed these features
>> is an existence proof of their value.
>
> Very interesting. We will certainly have a look at this.
> Does it work on Windows too?
>
> What I had in mind to add was a thinner layer on pthreads
> or whatever the emulator is using, so drivers could manage
> threads on their own.
>
>
>>
>> Chris
>>
>> Example snippets from the berkeley_db driver test suite:
>>
>>     ok = ?BDB:set_threadpool_params(BdbPort,
>> ?BDB_THREADPOOL_ID_WORKERS, 5, 64*1024, 2000),
>>
>>     [{edtk_threadpool_id,         ?BDB_THREADPOOL_ID_WORKERS},
>>      {curr_queue_len,             0},
>>      {curr_idle_threads,          30},
>>      {num_threads,                30},
>>      {curr_enqueued_poison_pills, 0},
>>      {queue_len_limit,            1000}] =
>> ?BDB:get_threadpool_info(BdbPort, ?BDB_THREADPOOL_ID_WORKERS),
>>
>> Chris
>>
>>
>> On Thu, Dec 18, 2008 at 7:55 AM, Raimo Niskanen
>> <raimo+erlang-questions@REDACTED> wrote:
>> > On Thu, Dec 18, 2008 at 08:08:06AM -0700, Dave Smith wrote:
>> >> One other note...I eventually wrote my own thread pool/queue for load
>> >> balancing purposes. async_* can cause the load to be very spiky if you
>> >> can not predict the amount of time a queued operation might take
>> >> (which is usually the point of using threads...). I think it would be
>> >> _great_ if the VM/driver interface provided a more traditional thread
>> >> pool/queue API in addition to the existing async functionality. I know
>> >> of several Erlang drivers that have had to write their own and it can
>> >> be a very challenging problem to get right. :)
>> >
>> > Agreed. It has been on the Future Plans list forever.
>> > Using async threads for e.g. prime number calculations
>> > will today block random file operations...
>> >
>> >>
>> >> D.
>> >>
>> >> On Thu, Dec 18, 2008 at 7:36 AM, Dave Smith <dizzyd@REDACTED> wrote:
>> >> >>>>> Is it the responsibility of the code in the stop() callback to call
>> >> >>>>> driver_async_cancel() on each outstanding async work item, or will this
>> >> >>>>> be done automatically by the emulator before the call to stop()?
>> >> >
>> >> > Yes, it is up to the driver to cancel outstanding items. In wrestling
>> >> > with this recently, I found that the best approach was to cancel all
>> >> > the items and then queue up a sentinel callback that triggers a
>> >> > condition variable. I then wait for that cv to get signaled and know
>> >> > at that point that the driver is clear to shutdown. This approach
>> >> > seemed to work quite well on an 8-proc box with reasonable load --
>> >> > YMMV. :)
>> >> >
>> >> >>>>> If this is the responsibility of the code in stop(), is it guaranteed
>> >> >>>>> that no async work item will be executing or scheduled during the call
>> >> >>>>> to the stop() callback?
>> >> >>>>>
>> >> >>>>> If no guarantee is made, is holding the PDL necessary and sufficient to
>> >> >>>>> guarantee this?
>> >> >
>> >> > I researched this, and I believe that the answer is "no". Scheduling
>> >> > the async work item does increment the PDL ref count when the item is
>> >> > _inserted_ into the queue. Once the item is in the queue, all bets are
>> >> > off -- the only guarantee you have is that the PDL won't disappear
>> >> > (i.e. you can safely lock/unlock it). The PDL says nothing about
>> >> > whether or not the async work item is executing during stop(). It
>> >> > seems to be the responsibility of the driver author to sort out these
>> >> > intricate timing issues.
>> >> >
>> >> > erts/emulator/beam/erl_async.c is where all the code of interest resides.
>> >> >
>> >> > It's entirely possible that I have made gross errors in my reading of
>> >> > the emulator code and would happily receive instruction from one of
>> >> > the core VM team. :)
>> >> >
>> >> > D.
>> >> >
>> >> _______________________________________________
>> >> erlang-questions mailing list
>> >> erlang-questions@REDACTED
>> >> http://www.erlang.org/mailman/listinfo/erlang-questions
>> >
>> > --
>> >
>> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
>> >
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>