New proposal: driver objects (in same track as driver_mkref)
Scott Lystig Fritchie
fritchie@REDACTED
Wed Jul 31 07:40:32 CEST 2002
>>>>> "sp" == Shawn Pearce <spearce@REDACTED> writes:
sp> After writing this suggestion, I'm wondering if it's even worth
sp> wasting the network bytes to transfer it to the mailing list, let
sp> alone actually implementing. Might as well throw it out there for
sp> discussion though.
You're right. That's a pretty far-out idea. :-) But the discussion
is worthwhile, particularly since I have something relevant to add.
sp> But what about the dangers of drivers returning pointers (and
sp> other driver private information) to Erlang as integers or
sp> binaries? These terms are easy to "manipulate" in Erlang,
sp> allowing an Erlang application to potentially send a bad value to
sp> the driver. So the driver author must code somewhat defensively.
I've come up with a solution that requires no change to the VM and
does most of what I think you're proposing. I'll yank an example out
of the Erlang Workshop paper I've been working on.
(THANK GOD I DON'T HAVE TO HAVE THAT PAPER FINISHED THIS WEEK!!!
Alright, I think I have that out of my system now.)
The "Erlang Driver Toolkit" code generator I've been working on
automatically generates all or almost all of the code a person needs
to create a driver for an existing C library. I quote an example from
the SWIG (http://www.swig.org/) documentation, then show how it can be
done using an EDTK-generated driver.
Assume that pli2002_drv provides a straightforward interface to
malloc(3), fopen(3), fread(3), and fwrite(3). Ignore the fact that
fwrite doesn't take the same number of arguments as fread: that's
intentional and explained in the paper, so never mind.
    -module(pli2002_test).
    -define(DRV, pli2002_drv).
    -define(BUFSIZ, 8192).
    -export([file_copy/1, file_copy/2]).

    %% command line "-s" usage
    file_copy([Src, Dst])
      when is_atom(Src), is_atom(Dst) ->
        file_copy(atom_to_list(Src),
                  atom_to_list(Dst)).

    file_copy(Src, Dst) ->
        {ok, Port} = ?DRV:start(),
        {ok, SrcF} = ?DRV:fopen(Port, Src, "r"),
        {ok, DstF} = ?DRV:fopen(Port, Dst, "w"),
        {ok, Buf} = ?DRV:malloc(Port, ?BUFSIZ),
        RFun = fun () ->
                   ?DRV:fread(Port, Buf, 1, ?BUFSIZ, SrcF) end,
        WFun = fun (N) ->
                   ?DRV:fwrite(Port, Buf, N, DstF) end,
        Val = file_copy2(RFun, WFun),
        %% Shutdown will automatically close
        %% files and free the buffer.
        ?DRV:shutdown(Port),
        Val.

    file_copy2(RFun, WFun) ->
        file_copy2(RFun, WFun, RFun()).

    file_copy2(RFun, WFun, {ok, N}) ->
        WFun(N),
        file_copy2(RFun, WFun);
    file_copy2(_RFun, _WFun, {error, 0}) ->
        ok; % End of file
    file_copy2(_RFun, _WFun, Error) ->
        Error.
In the section of the paper where I talk about the hassles/joys of
single-assignment, I point out that this function is fine as written
in Python but awkward to express in Erlang.
The return of pli2002_drv:malloc/2 is not a pointer cast as an integer
(though you can tell EDTK to do that for you) or hidden inside a
binary (though I'm tempted by that). Instead it's what I've called
a "value map" tuple. It looks like {valmap_ptr, 5}. The "5" is an
index into a private value map table maintained by the driver. The
index is what gets passed into the driver: here's free/2:
    free(Port,
         Ptr
        ) when is_port(Port) -> % TODO: Add additional constraints here
        {valmap_ptr, PtrIndex} = Ptr,
        B = <<?_FREE,
              PtrIndex:32/integer
            >>,
        erlang:port_command(Port, B),
        get_port_reply(Port).
The pattern matching extracts the index value, but it has the side
benefit of causing an error if the caller passed in something else.(*)
It's not foolproof, as I also discuss (I think?), but it works pretty
well.
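To make the pattern-matching check concrete, here is a minimal sketch
that needs no port or driver at all. It only reproduces the
Erlang-side encoding step of free/2 above. The module name, the
FREE_CMD value, and the {valmap_file, _} tag are assumptions made up
for illustration; the real ?_FREE value comes from the EDTK-generated
code and isn't shown in this post.

```erlang
-module(valmap_sketch).
-export([encode_free/1, decode_index/1]).

%% Hypothetical command byte standing in for ?_FREE.
-define(FREE_CMD, 1).

%% Build the binary that free/2 would hand to erlang:port_command/2.
%% Fails with badmatch unless Ptr is a {valmap_ptr, Index} tuple.
encode_free(Ptr) ->
    {valmap_ptr, PtrIndex} = Ptr,
    <<?FREE_CMD, PtrIndex:32/integer>>.

%% What the receiving side would recover from that binary:
%% one command byte followed by a 32-bit big-endian index.
decode_index(<<?FREE_CMD, Index:32/integer>>) ->
    Index.
```

Passing a bare integer or a tuple with a different tag, say
{valmap_file, 5}, fails at the match before anything reaches the
port. A forged {valmap_ptr, 999} still gets through, which is one
reason the check isn't foolproof: the driver presumably still has to
validate the index against its own table.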
Anyway, the driver can hide whatever it wants behind these value map
thingies. I have example drivers that store malloc'ed pointers as
well as file descriptors and Berkeley DB database and environment
"handles".
The driver's "stop" method calls the user's deallocation/close/cleanup
function for each entry in each value map whenever the port is closed:
'cause port_close/1 was called or because the owner process died/was
killed. As the comment in file_copy/2 says, closing the port will
close both "FILE *" thingies (and their file descriptors and free any
additional buffers they were hiding) as well as free the malloc'ed
memory.
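The cleanup-on-stop idea can be modeled in a few lines of plain
Erlang. This is only a sketch of the bookkeeping, not the real thing:
the actual value maps live inside the C driver, and the module name,
the function names, and the {NextIndex, Entries} representation here
are all invented for illustration. It models a single value map; a
real driver would have one per resource type.

```erlang
-module(valmap_cleanup_sketch).
-export([new/0, add/3, stop/1]).

%% A value map is modeled as {NextIndex, Entries}, where Entries maps
%% Index -> {Resource, CleanupFun}.
new() ->
    {0, #{}}.

%% Register a resource together with the function that releases it.
%% Returns the handle given to the caller and the updated value map.
add(Resource, CleanupFun, {Next, Entries}) ->
    Handle = {valmap_ptr, Next},
    {Handle, {Next + 1, Entries#{Next => {Resource, CleanupFun}}}}.

%% Model of the driver's "stop" method: run the cleanup function for
%% every entry still in the map and return how many were released.
stop({_Next, Entries}) ->
    maps:fold(fun(_Index, {Resource, CleanupFun}, Count) ->
                  CleanupFun(Resource),
                  Count + 1
              end, 0, Entries).
```

The point of the design is that the caller never has to remember what
it allocated: whatever is still in the table when the port goes away
gets released, whether the port was closed on purpose or the owner
process died.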
IMHO it's pretty slick. It makes it almost impossible for a driver to
leak resources, unless the owner process allocates stuff and then
lives forever. But then you've got bigger problems.
If that's the sort of sanity mechanism you're proposing, I agree, it's
very useful.
-Scott
(*) I borrowed this bit of type checking from SWIG, which encodes its
pointers as something like (fuzzy memory) "_0x01234f30_FILE_": it
includes both the pointer address and the pointer type.