Multithreaded Drivers

Fri Aug 3 10:05:29 CEST 2001

I managed to send this message to Sean only and missed the list; I am
now finally fixing that mistake...

/ Raimo Niskanen, Erlang/OTP, Ericsson UAB.

------------------------------------------------------------------------------------------

Sorry for the delay, Sean, I just came back from my vacation. Answers
inserted below...

Sean Hinde wrote:
> 
> All,
> 
> I'm digging through the existing drivers trying to figure out how they
> work.. Particularly with regard to multithreaded drivers.
> 
> So far I have figured out from studying efile_drv that:
> 
> The start function should return a pointer to the struct which holds all the
> state for the thread. This is so that each call to open_port will generate
> its own state which is passed to whichever thread happens to be used to
> process the call. This pointer is cast to a long to keep the type system
> happy :)
> 

The start function and all driver callback functions are called in the
same thread context, i.e. the emulator main thread. The emulator uses
only one thread, the main thread, and emulates all erlang processes in
that thread. This is to avoid platform-dependent and time-consuming
thread locking between erlang processes.

The multithread interface that efile_drv uses is the only way to execute
in other threads from within the emulator (except for some obscure
support threads), and the only code that executes in other threads is
the callback (worker) functions given to driver_async(). Therefore all
thread specific data must be accessible through the data pointer also
given to driver_async().
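
A minimal sketch of that last point, not taken from efile_drv: the
struct and function names below are made up for illustration. The worker
has the void (*)(void *) shape that driver_async() takes for its worker
function, and everything it reads or writes lives behind the data
pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job struct: everything the worker needs, and everything
 * it produces, lives here and nowhere else. */
typedef struct {
    const unsigned char *buf;  /* input: bytes to process */
    size_t len;                /* input: number of bytes */
    unsigned long sum;         /* output: filled in by the worker */
    int done;                  /* output: set to 1 when finished */
} checksum_job;

/* Worker function. It touches no globals and no driver state, only the
 * struct it was handed, so it is safe to run outside the main thread. */
void checksum_worker(void *data)
{
    checksum_job *job = data;
    unsigned long sum = 0;
    size_t i;

    assert(job != NULL);
    for (i = 0; i < job->len; i++)
        sum += job->buf[i];
    job->sum = sum;
    job->done = 1;
}
```

In a real driver a pointer to such a struct would be the data argument
to driver_async(), and the result would be read back in the main thread
once the worker has finished.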

> Q. Can I use the new erl_driver.h functions with multithreaded drivers (I
> notice that not all the functions in driver.h are included in the new one,
> particularly driver_async)?
> 

We have been trying for a while to get rid of driver.h, and for R8 it
finally seems to be happening; erl_driver.h will then (probably) contain
all functions from driver.h.

> Q. I presume if so I can just cast the pointer to my state into a
> ErlDrvData?
> 

Yes.
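
A rough standalone sketch of the round trip. To compile without
erl_driver.h it uses a stand-in opaque pointer type (DrvData) in place
of ErlDrvData and malloc()/free() in place of driver_alloc()/
driver_free(); those substitutions, the my_* names and the simplified
error handling are all assumptions of the sketch:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for ErlDrvData: an opaque pointer type. */
typedef struct drv_data_standin *DrvData;

/* Hypothetical per-port state. */
typedef struct {
    int fd;
    unsigned int key;
} my_state;

/* What a start function might do: allocate the state and hand it back
 * as the opaque handle. malloc() keeps the sketch standalone. */
DrvData my_start(int fd)
{
    my_state *st = malloc(sizeof *st);
    if (st == NULL)
        return NULL;
    st->fd = fd;
    st->key = (unsigned int)fd;
    return (DrvData)st;
}

/* Any later callback casts the handle back before using it. */
int my_fd(DrvData handle)
{
    assert(handle != NULL);
    return ((my_state *)handle)->fd;
}

/* Counterpart of a stop function: release the state. */
void my_stop(DrvData handle)
{
    free(handle);
}
```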

> The start function in efile_drv uses the function sys_alloc_from(200,
> sizeof(file_descriptor_state_struct)). I can't figure out the point of the
> 200, or how to decide what it should be in my case. As far as I can make out
> it only has an effect in an instrumented system (from looking at the def in
> sys.h).
> 

Correct, it is used for an instrumented system. You can look in the
source code for instrument.erl in application 'tools' to see the values
that are defined today.

> Q. What is the approved way to assign and clear memory in this case and in
> drivers in general?
> 

To allocate and free memory please use driver_alloc(), driver_realloc()
and driver_free() in erl_driver.h. Also, driver_alloc_binary(),
driver_realloc_binary() and driver_free_binary() may be useful.

> Q. Under what circumstances is the free function passed to driver_async
> called (I couldn't find any from a quick look)?
> 

In two cases.

When an async request is cancelled with driver_async_cancel(), and the
request has not started to run yet. For each request, either the free
function or the driver callback async_ready is called.

When a port is closing while it still has unfinished async requests. The
driver callbacks cannot be called since the port is regarded as closed,
so the free function is called instead.
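
The rule can be modelled in a few lines. This is only a model of the
behaviour described above, not emulator code; the type and function
names are invented, and skipping a missing free function is a choice of
the sketch:

```c
#include <assert.h>
#include <stddef.h>

/* How a request ended, in this model. */
typedef enum { REQ_COMPLETED, REQ_CANCELLED, REQ_PORT_CLOSED } req_outcome;

typedef struct {
    void *data;
    void (*ready_fn)(void *);  /* stands in for the async_ready callback */
    void (*free_fn)(void *);   /* stands in for the free function */
} request_model;

/* Hand the request data to one of the two functions: the ready
 * callback for a completed request, otherwise the free function. */
void release_request(request_model *req, req_outcome outcome)
{
    assert(req != NULL && req->ready_fn != NULL);
    if (outcome == REQ_COMPLETED)
        req->ready_fn(req->data);
    else if (req->free_fn != NULL)
        req->free_fn(req->data);
}

/* Counting callbacks, only there to make the model observable. */
int ready_calls = 0;
int free_calls = 0;
void count_ready(void *data) { (void)data; ready_calls++; }
void count_free(void *data)  { (void)data; free_calls++; }
```

The practical consequence is that whatever the data pointer owns must be
releasable from both functions.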

> Q. What is the key parameter intended for and do I ever need to worry about
> it?
> 

The key parameter is used to select the thread. If it is NULL, as for
efile_drv, the requests are scheduled round robin on the threads in the
async thread pool. Otherwise it must be a pointer to an integer that is
used to select the thread through a simple hash function (modulo the
number of threads).
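
A small model of that selection rule, written from the description
above rather than from the emulator source; thread_pool_model and
select_thread are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/* Model of the async thread pool: just its size and a round-robin
 * cursor. */
typedef struct {
    unsigned int nthreads;
    unsigned int next;
} thread_pool_model;

/* Pick a thread index for a request. A NULL key means round robin;
 * otherwise the integer behind the key is taken modulo the pool size. */
unsigned int select_thread(thread_pool_model *pool, const unsigned int *key)
{
    unsigned int chosen;

    assert(pool != NULL && pool->nthreads > 0);
    if (key == NULL) {
        chosen = pool->next;
        pool->next = (pool->next + 1) % pool->nthreads;
        return chosen;
    }
    return *key % pool->nthreads;
}
```

With four threads, a key of 10 always lands on thread 2, so requests
carrying the same key stay in order on one thread. A key of 6 lands
there too: the key pins a request to a thread, it does not reserve the
thread.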

E.g. efile_drv may have to worry about this when (if) it becomes possible
to access a file from different erlang processes. Requests to one
specific file must then be handled by the same thread, or else a read
from one erlang process could run ahead of a write from another erlang
process by using a different thread. Therefore efile_drv would have to
use the file descriptor, or perhaps the port number, as key; it must be
an integer unique for the file.

> I've got the hang of the idea that the main function driven by
> port_command/2 should just schedule the work to be done in a thread sometime
> later with the driver_async function, and that when the thread has done its
> work for this time it puts itself into the queue to call the async_ready
> callback which sends the result back to erlang. This looks fine..
> 
> Q. What happens if I send another port_command/2 to the port before the
> result of the possibly time consuming last operation has returned?
> 

This is entirely up to your interface between the erlang process and the
driver. You decide if the erlang process should wait for a message from
the port after each port_command/2 or not. 

Messages are sent asynchronously from the erlang process to the port
with port_command/2, and the other way (also asynchronously) with
driver_output*(). Note that set_busy_port() can be used to suspend
future port_command/2 calls. Synchronous calls from erlang processes to
the port are made with port_control/3.

> Q. I wonder if I can use this to call the driver_cancel function sometime
> later to abort the operation?
> 

Yes you can, until the request has started to run on a thread in the
thread pool. The function driver_async_cancel() returns 1 if the request
was cancelled, and 0 if the request could not be found in the queues
(it might be running, might already have run, or the async_id is
invalid).
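
The return value can be illustrated with a toy queue of pending request
ids. This is a model of the semantics described above only; the real
queues are internal to the emulator, and pending_queue and the queue_*
functions are invented for the sketch:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Requests that have been scheduled but not yet started. */
typedef struct {
    long ids[MAX_PENDING];  /* oldest request first */
    size_t count;
} pending_queue;

/* Queue a request; returns 1 on success, 0 if the queue is full. */
int queue_push(pending_queue *q, long id)
{
    assert(q != NULL);
    if (q->count == MAX_PENDING)
        return 0;
    q->ids[q->count++] = id;
    return 1;
}

/* A worker thread takes the oldest request. From here on the request
 * is running and no longer in the queue. Returns -1 if empty. */
long queue_take(pending_queue *q)
{
    long id;
    size_t i;

    assert(q != NULL);
    if (q->count == 0)
        return -1;
    id = q->ids[0];
    for (i = 1; i < q->count; i++)
        q->ids[i - 1] = q->ids[i];
    q->count--;
    return id;
}

/* Cancel by id: 1 if the request was still pending and is now removed,
 * 0 if it was not found (running, already run, or never existed). */
int queue_cancel(pending_queue *q, long id)
{
    size_t i, j;

    assert(q != NULL);
    for (i = 0; i < q->count; i++) {
        if (q->ids[i] == id) {
            for (j = i + 1; j < q->count; j++)
                q->ids[j - 1] = q->ids[j];
            q->count--;
            return 1;
        }
    }
    return 0;
}
```

A caller cannot tell from a 0 whether the request is still running or
has already finished, so the driver has to track that itself if it
matters.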

> Q. What is the best way to timeout the operation which is being carried out
> in a thread so I can return the resource to useful state (in case it is in
> an endless loop, or the far end of some link has died leaving the thread
> waiting forever..)?
> 

While the request is running on a thread it cannot be cancelled. This
also blocks all async requests that may be scheduled on the same thread
later. All async requests must be reasonably short, or else they will
slow down e.g. efile_drv. It is best if the worker function itself can
limit its execution time.
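
One way a worker might do that, as a sketch under assumptions: the work
can be split into short steps, and a POSIX monotonic clock
(clock_gettime) is available. bounded_job and the step functions are
hypothetical names. Note that this cannot interrupt a step that blocks
forever; such a step needs its own timeout on the blocking call.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef enum { WORK_DONE, WORK_TIMED_OUT } work_status;

/* Hypothetical job: a step function called repeatedly until it
 * reports completion or the time budget is used up. */
typedef struct {
    long budget_ms;        /* input: maximum time to spend */
    int (*step)(void *);   /* input: returns nonzero when finished */
    void *step_arg;        /* input: passed to step */
    work_status status;    /* output: how the worker ended */
} bounded_job;

static long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Worker in the void(*)(void *) shape. It always runs at least one
 * step, then gives up once the deadline has passed. */
void bounded_worker(void *data)
{
    bounded_job *job = data;
    long deadline;

    assert(job != NULL && job->step != NULL);
    deadline = now_ms() + job->budget_ms;
    job->status = WORK_TIMED_OUT;
    do {
        if (job->step(job->step_arg)) {
            job->status = WORK_DONE;
            return;
        }
    } while (now_ms() < deadline);
}

/* Example steps: one that finishes after *n calls, one that never
 * finishes (the "endless loop" case). */
int step_countdown(void *arg)
{
    int *n = arg;
    return --(*n) <= 0;
}

int step_never_finishes(void *arg)
{
    (void)arg;
    return 0;
}
```

The status field lets the code that collects the result tell a finished
job from an abandoned one and report a timeout back to erlang.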

It might be possible to use the erts_mutex*() and erts_cond*() functions
in some way, but the water is getting deep here...


Hope this helps you forward.

/ Raimo Niskanen, Erlang/OTP, Ericsson UAB.