[erlang-questions] driver_entry stop() and driver_async() interaction

Fri Dec 19 14:52:20 CET 2008

Here are the results from the internal meeting...

On Thu, Dec 18, 2008 at 04:54:26PM +0100, Raimo Niskanen wrote:
> On Thu, Dec 18, 2008 at 06:41:31AM -0600, Paul Fisher wrote:
> > Any word on this?
> 
> Sorry, no. I will try to summon an internal meeting about this...
> 
> > 
> > 
> > Raimo Niskanen wrote:
> > > On Thu, Nov 06, 2008 at 12:22:45PM -0600, Paul Fisher wrote:
> > >> Can anyone comment on this question i sent a few weeks ago?
> > >>
> > >> When the port owner terminates and there are driver_async() requests 
> > >> scheduled and not yet executed, I am seeing them outstanding at the time 
> > >> of the stop() callback.
> > > 
> > > We will take a look at it and make it clear in some
> > > documentation and give you an answer. It will take a bit of
> > > code review to be certain...
> > > 
> > >>
> > >> Paul Fisher wrote:
> > >>> Question about work scheduled on async threads via driver_async() and
> > >>> still pending (i.e. they are still on the queue to be executed,) when
> > >>> the stop() callback is invoked.  Specifically, assuming smp w/async
> > >>> thread pool and the driver marked ERL_DRV_FLAG_USE_PORT_LOCKING.
> > >>>
> > >>> Is it the responsibility of the code in the stop() callback to call
> > >>> driver_async_cancel() on each outstanding async work item, or will this
> > >>> be done automatically by the emulator before the call to stop()?

The emulator will not cancel the jobs. You may cancel
async jobs from the stop() callback, but a job that is
already running cannot be cancelled and will keep on running.

The trick Dave Smith described in another post works (apparently).
In the stop() callback, cancel all async jobs. Some may not
cancel since they are already running. Then start a new
special async job that, when invoked, signals a condition
variable, and have stop() wait on that condition variable
after starting the special job. That way stop() finishes after
all async jobs are done, and when stop() exits the port
dies. Period. This of course blocks the scheduler thread
executing stop() until all async jobs are done.
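
The cancel-then-barrier sequence can be sketched roughly as
follows. This is only a simulation with plain pthreads, not
driver code: pool_submit(), pool_cancel_pending(), demo_stop()
and demo_run() are hypothetical names standing in for
driver_async(), driver_async_cancel(), the stop() callback and
a test harness. It uses a single worker thread, so jobs run in
FIFO order; the trick presumably relies on the special job
being queued behind the others on the same async thread.

```c
/* Simulation only: plain pthreads, NOT the erl_driver API. pool_submit(),
 * pool_cancel_pending() and demo_stop() are hypothetical stand-ins for
 * driver_async(), driver_async_cancel() and the stop() callback. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct job {
    void (*invoke)(void *);   /* runs on the worker ("async") thread */
    void (*free_fn)(void *);  /* frees job data after the run or on cancel */
    void *data;
    struct job *next;
} job_t;

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_cond = PTHREAD_COND_INITIALIZER;
static job_t *q_head, *q_tail;
static int q_shutdown, completed;  /* completed: written only by the worker */

static void pool_submit(void (*invoke)(void *), void (*free_fn)(void *), void *data) {
    job_t *j = malloc(sizeof *j);
    if (!j) abort();
    j->invoke = invoke; j->free_fn = free_fn; j->data = data; j->next = NULL;
    pthread_mutex_lock(&q_lock);
    if (q_tail) q_tail->next = j; else q_head = j;
    q_tail = j;
    pthread_cond_signal(&q_cond);
    pthread_mutex_unlock(&q_lock);
}

/* Drop every job that has not started yet; a running job is unaffected. */
static int pool_cancel_pending(void) {
    int n = 0;
    pthread_mutex_lock(&q_lock);
    job_t *j = q_head;
    q_head = q_tail = NULL;
    pthread_mutex_unlock(&q_lock);
    for (job_t *next; j; j = next, n++) {
        next = j->next;
        if (j->free_fn) j->free_fn(j->data);
        free(j);
    }
    return n;
}

static void *worker(void *arg) {          /* single thread, so strictly FIFO */
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&q_lock);
        while (!q_head && !q_shutdown) pthread_cond_wait(&q_cond, &q_lock);
        job_t *j = q_head;
        if (j) { q_head = j->next; if (!q_head) q_tail = NULL; }
        pthread_mutex_unlock(&q_lock);
        if (!j) return NULL;              /* shutdown with an empty queue */
        j->invoke(j->data);
        if (j->free_fn) j->free_fn(j->data);
        free(j);
    }
}

typedef struct { pthread_mutex_t lock; pthread_cond_t cond; int done; } barrier_t;

static void barrier_invoke(void *p) {     /* the "special" job */
    barrier_t *b = p;
    pthread_mutex_lock(&b->lock);
    b->done = 1;
    pthread_cond_signal(&b->cond);
    pthread_mutex_unlock(&b->lock);
}

static int demo_stop(void) {
    barrier_t b = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };
    int cancelled = pool_cancel_pending();    /* step 1: cancel what we can */
    pool_submit(barrier_invoke, NULL, &b);    /* step 2: queue barrier last */
    pthread_mutex_lock(&b.lock);
    while (!b.done) pthread_cond_wait(&b.cond, &b.lock);  /* step 3: wait */
    pthread_mutex_unlock(&b.lock);
    return cancelled;
}

static void work_invoke(void *p) { (void)p; completed++; }

/* Returns jobs run + jobs cancelled, read after demo_stop() has returned. */
int demo_run(int njobs) {
    pthread_t tid;
    completed = 0; q_shutdown = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0) abort();
    for (int i = 0; i < njobs; i++)
        pool_submit(work_invoke, free, malloc(16));  /* data freed by free_fn */
    int cancelled = demo_stop();
    int total = cancelled + completed;
    pthread_mutex_lock(&q_lock);
    q_shutdown = 1;
    pthread_cond_signal(&q_cond);
    pthread_mutex_unlock(&q_lock);
    pthread_join(tid, NULL);
    return total;
}
```

Whatever the timing, every job submitted before demo_stop()
has either run or been cancelled (and had its data freed) by
the time demo_stop() returns, so demo_run(n) returns n. The
cost is the one noted above: the caller blocks until the
running job and the barrier job have finished.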

But leaving async jobs behind after stop() has exited is
not supposed to be wrong. After an async job is done, either
the ready_async() callback is executed, if it exists, or
the job's async_free() callback is executed. The async_free()
callback is also executed if the async job is cancelled.
Unfortunately there is currently a bug: async_free() is not
executed if the port dies before ready_async() is supposed
to be executed, which causes a memory leak in this case.

After this bug has been fixed you may (supposedly) just
leave the async jobs around after the stop() callback, and
their async_free() callbacks will be executed to free their
allocated memory, if this is fine with your application.

Another trick is to keep data in the IO queue. If there
is data in the IO queue when the port is killed, the
flush() callback is executed to try to force the data
out of the queue. The flush() callback does not have
to succeed in flushing the data; it is more or less
an informative callback. When the last data in
the IO queue is consumed through driver_deq(),
the stop() callback will be executed. If you use
the IO queue to communicate with the async jobs,
you can arrange things so that they all have to be
done before the stop() callback is executed.
For some applications this is natural and for
others it is a hack.

> > >>>
> > >>> If this is the responsibility of the code in stop(), is it guaranteed
> > >>> that no async work item will be executing or scheduled during the call
> > >>> to the stop() callback?

There are no such guarantees. Async jobs are supposed to run
in parallel with other port callbacks. If you use the
IO queue from threads other than the scheduler threads
(which run the regular callbacks), e.g. from async jobs, you
must protect accesses to it using the port data lock (PDL).
Likewise, any other data shared between the async thread and
the driver callbacks must be lock protected. You can use the
PDL for this or a lock of your own.
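
As a rough illustration of the "lock of your own" option, here
is a small pthreads simulation, again not driver code. The
shared_t struct, bump(), async_job() and demo_shared() are all
hypothetical names; the mutex stands in for either the PDL or
a driver-private lock, and the main thread plays the part of
the driver callbacks.

```c
/* Simulation only: a pthread mutex stands in for the PDL or a private lock. */
#include <assert.h>
#include <pthread.h>

#define N_UPDATES 100000

typedef struct {
    pthread_mutex_t lock;   /* hypothetical: guards everything below */
    long counter;           /* data shared by the async job and the callbacks */
} shared_t;

/* Every access to the shared data goes through the lock. */
static void bump(shared_t *s) {
    pthread_mutex_lock(&s->lock);
    s->counter++;
    pthread_mutex_unlock(&s->lock);
}

/* Plays the role of an async job running on its own thread. */
static void *async_job(void *p) {
    for (int i = 0; i < N_UPDATES; i++) bump(p);
    return NULL;
}

/* The calling thread plays the role of the driver callbacks. */
long demo_shared(void) {
    shared_t s = { PTHREAD_MUTEX_INITIALIZER, 0 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, async_job, &s) != 0) return -1;
    for (int i = 0; i < N_UPDATES; i++) bump(&s);
    pthread_join(tid, NULL);
    return s.counter;
}
```

With the lock in place no update is lost, so demo_shared()
returns 2 * N_UPDATES. Without it the two threads would race
on the counter.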

> > >>>
> > >>> If no guarantee is made, is holding the PDL necessary and sufficient to
> > >>> guarantee this?

No, holding the PDL only synchronises accesses to the IO queue.
In the future, other emulator port data may also come to be
protected by this lock.

> > >>
> > >> --
> > >> paul
> > >> _______________________________________________
> > >> erlang-questions mailing list
> > >> erlang-questions@REDACTED
> > >> http://www.erlang.org/mailman/listinfo/erlang-questions
> > > 
> > 
> 
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB