File locking & device drivers

Wed May 15 12:18:50 CEST 2002

Leonid Timochouk wrote:
> 
> Hello Erlang users,
> 
> I am trying to implement a file locking capability (an interface to the
> "flock" system call) in Erlang. Unfortunately, the "file" module does not
> provide it straight away.
> 
> My first approach was to build a linked-in device driver, and it works
> almost fine (it uses "flock" in non-blocking mode in order not to block
> the whole Erlang node). There is a problem, however: it communicates with
> the Erlang code by sending messages, and the driver responses come into
> the same main message queue of the calling process. Now, suppose we have a
> process which reads incoming data from its message queue and writes them
> into a file, locking the file each time. Then the whole message queue
> would need to be traversed each time by the run-time system when we try to
> fetch the driver responses, so saving messages in a file has QUADRATIC
> rather than linear complexity w.r.t. the length of the incoming queue.
> This is a major inefficiency.
> 

Is this not more of a general flow control problem? Asynchronous message
passing is THE way to synchronize processes in Erlang. If you have a
server that passes messages on to somewhere (a file, guarding all
writes with 'flock') and many clients that just feed the server (no
request/reply message passing), the server's message queue may grow.
That hurts the server's ability to serve requests, since these client
messages get in the way of more important messages.
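For reference, the non-blocking 'flock' guard discussed here could look
roughly like the C sketch below. It is only an illustration, not the
original poster's driver: the helper names (try_lock_exclusive,
unlock_file, demo_contention) and the scratch file path are made up for
the example. The point is that LOCK_NB makes the call return at once
with EWOULDBLOCK instead of stalling the calling thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* open, O_RDWR, O_CREAT */
#include <stdio.h>      /* snprintf */
#include <sys/file.h>   /* flock, LOCK_EX, LOCK_NB, LOCK_UN */
#include <unistd.h>     /* close, unlink, getpid */

/* Hypothetical helper: try to take an exclusive lock without blocking.
 * Returns 1 if the lock was taken, 0 if it is held through another
 * open of the file, and -1 on any other error (errno set by flock). */
int try_lock_exclusive(int fd)
{
    assert(fd >= 0);
    if (flock(fd, LOCK_EX | LOCK_NB) == 0)
        return 1;
    if (errno == EWOULDBLOCK)
        return 0;
    return -1;
}

/* Hypothetical helper: release the lock. 0 on success, -1 on error. */
int unlock_file(int fd)
{
    assert(fd >= 0);
    return flock(fd, LOCK_UN);
}

/* Small demonstration: two independent opens of one scratch file
 * contend for the lock. Returns 0 if everything behaved as expected. */
int demo_contention(void)
{
    char path[64];
    int fd1, fd2, ok;

    /* Scratch file name is an assumption made for this example. */
    snprintf(path, sizeof path, "/tmp/flock_sketch_%ld", (long)getpid());
    fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    fd2 = open(path, O_RDWR);
    if (fd2 < 0) {
        close(fd1);
        unlink(path);
        return -1;
    }

    ok = try_lock_exclusive(fd1) == 1    /* first taker gets the lock   */
      && try_lock_exclusive(fd2) == 0    /* second is refused at once   */
      && unlock_file(fd1) == 0
      && try_lock_exclusive(fd2) == 1;   /* lock is free after unlock   */

    close(fd1);
    close(fd2);
    unlink(path);
    return ok ? 0 : -1;
}
```

A driver built this way still has to report the "would block" case back
to Erlang and retry later, which is where the reply messages in the
caller's queue come from.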

There is a mechanism in the Erlang emulator (virtual machine) that
compensates for this problem. A process that sends a message to another
process with a large message queue gets punished by being scheduled
out sooner. This mechanism diminishes the producer/consumer problem.

Another way to get efficient data dumping to a file would be to write a
driver that does almost all of the work. I think this is the track Ulf
Wiger was on in his earlier reply. If all producers (clients) wrote
to a port (with a registered name), the port could queue the
requests as it wished, or just call 'set_busy_port(port, !0)' when it
has enough buffered for the moment.

Any process writing to a port that is busy gets suspended in the write
('erlang:port_command(Port, Data)'), so no receive queue would have to
be examined for a response.
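To make the busy-port idea concrete, here is a small self-contained C
model of the bookkeeping. It is not driver code and does not use
erl_driver.h: struct port_model, model_set_busy and the two watermark
constants are all assumptions made for the sketch. In a real driver
the place marked below is where 'set_busy_port(port, on)' would be
called, and the emulator would do the suspending of writers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed thresholds (bytes buffered); a real driver would tune these. */
#define HIGH_WATERMARK 8
#define LOW_WATERMARK  2

/* Toy stand-in for the driver's per-port state. */
struct port_model {
    size_t buffered;   /* bytes accepted but not yet written to the file */
    int    busy;       /* non-zero while writers should be held back     */
};

/* Stand-in for what a real driver would do via set_busy_port(port, on). */
static void model_set_busy(struct port_model *p, int on)
{
    p->busy = on;
}

/* A producer wrote len bytes to the port. */
void model_output(struct port_model *p, size_t len)
{
    /* With a real port the emulator suspends writers while it is busy,
     * so output should not arrive in that state. */
    assert(!p->busy);
    p->buffered += len;
    if (p->buffered >= HIGH_WATERMARK)
        model_set_busy(p, 1);      /* enough buffered for the moment */
}

/* The driver has flushed len bytes to the file. */
void model_flushed(struct port_model *p, size_t len)
{
    assert(len <= p->buffered);
    p->buffered -= len;
    if (p->busy && p->buffered <= LOW_WATERMARK)
        model_set_busy(p, 0);      /* let the suspended writers continue */
}
```

Using two thresholds instead of one is a design choice of this sketch,
so that the port does not flip between busy and not busy on every
write.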

> Would modifying the built-in "efile" driver be a better solution? If so,
> should "flock" be implemented via "file_output" or "file_outputv"? Why is,
> for example, "close" implemented as a "file_outputv" operation, although
> it has no "vector" data to deal with? Any guidance to "efile" driver
> internals will be very much appreciated.
> 

You would probably not gain any performance by modifying the 'efile'
driver, since it cannot do anything better than your driver.

All operations should be implemented in 'file_outputv' just because it
exists. The ones in 'file_output' are just waiting for whoever is
responsible for the driver to find time to move them over.

These functions ('file_outputv' and 'file_output') are part of the
interface between the emulator and the driver, and since the outputv
variant exists it is always called. 'file_outputv' flattens the buffer
vector and calls 'file_output' itself for the requests not yet
moved over (slightly inefficient).
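The "flatten and delegate" pattern can be sketched in plain C as
follows. This is not the efile source: it uses the POSIX struct iovec
in place of the emulator's own vector type, and flat_output,
flatten_iov, vectored_output and the 256-byte scratch buffer are names
and sizes assumed for the example. It shows where the extra copy comes
from.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical flat handler standing in for a 'file_output'-style
 * entry point. Here it only returns the first byte (think of it as
 * the command code), or -1 for an empty request. */
int flat_output(const char *buf, size_t len)
{
    return len > 0 ? (unsigned char)buf[0] : -1;
}

/* Copy every segment of the vector into dst, which holds cap bytes.
 * Returns the total number of bytes, or (size_t)-1 if dst is too small. */
size_t flatten_iov(const struct iovec *iov, int iovcnt, char *dst, size_t cap)
{
    size_t off = 0;
    int i;

    assert(dst != NULL);
    for (i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len == 0)
            continue;                       /* nothing to copy */
        if (iov[i].iov_len > cap - off)
            return (size_t)-1;              /* would overflow dst */
        memcpy(dst + off, iov[i].iov_base, iov[i].iov_len);
        off += iov[i].iov_len;
    }
    return off;
}

/* Vectored entry point: for a request that has no vector-aware
 * implementation yet, flatten the data and hand it to the flat one. */
int vectored_output(const struct iovec *iov, int iovcnt)
{
    char flat[256];   /* assumed scratch size for this sketch */
    size_t n = flatten_iov(iov, iovcnt, flat, sizeof flat);

    if (n == (size_t)-1)
        return -1;
    return flat_output(flat, n);
}
```

The memcpy loop is the "slightly inefficient" part: a request handled
directly in the vectored function can skip it.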

The function 'file_outputv' gets vectorised data from the emulator,
which is most beneficial for the writev and readv operations.

/ Raimo Niskanen, Erlang/OTP, Ericsson AB