New proposal: driver objects (in same track as driver_mkref)

Scott Lystig Fritchie fritchie@REDACTED
Wed Jul 31 07:40:32 CEST 2002


>>>>> "sp" == Shawn Pearce <spearce@REDACTED> writes:

sp> After writing this suggestion, I'm wondering if its even worth
sp> wasting the network bytes to transfer it to the mailing list, let
sp> alone actually implementing.  Might as well throw it out there for
sp> discussion though.

You're right.  That's a pretty far-out idea.  :-)  But the discussion
is worthwhile, particularly since I have something relevant to add.

sp> But what about the dangers of drivers returning pointers (and
sp> other driver private information) to Erlang as integers or
sp> binaries?  These terms are easy to "manipulate" in Erlang,
sp> allowing an Erlang application to potentially send a bad value to
sp> the driver.  So the driver author must code somewhat defensively.

I've come up with a solution that requires no change to the VM and
does most of what I think you're proposing.  I'll yank an example out
of the Erlang Workshop paper I've been working on.

(THANK GOD I DON'T HAVE TO HAVE THAT PAPER FINISHED THIS WEEK!!!
 Alright, I think I have that out of my system now.)

The "Erlang Driver Toolkit" code generator I've been working on
automatically generates all or almost all of the code a person needs
to create a driver for an existing C library.  I quote an example from
the SWIG (http://www.swig.org/) documentation, then show how it can be
done using an EDTK-generated driver.

Assume that pli2002_drv provides a straightforward interface to
malloc(3), fopen(3), fread(3), and fwrite(3).  Ignore the fact that
fwrite doesn't take the same number of arguments as fread: that's
intentional and explained in the paper, so never mind.

    -module(pli2002_test).
    
    -define(DRV, pli2002_drv).
    -define(BUFSIZ, 8192).
    
    -export([file_copy/1, file_copy/2]).
    
    %% command line "-s" usage
    file_copy([Src, Dst])
      when is_atom(Src), is_atom(Dst) ->
        file_copy(atom_to_list(Src), 
                  atom_to_list(Dst)).
    
    file_copy(Src, Dst) ->
        {ok, Port} = ?DRV:start(),
        {ok, SrcF} = ?DRV:fopen(Port, Src, "r"),
        {ok, DstF} = ?DRV:fopen(Port, Dst, "w"),
        {ok, Buf} = ?DRV:malloc(Port, ?BUFSIZ),
        RFun = fun () -> 
            ?DRV:fread(Port, Buf, 1, ?BUFSIZ, SrcF) end,
        WFun = fun (N) ->
            ?DRV:fwrite(Port, Buf, N, DstF) end,
        Val = file_copy2(RFun, WFun),
        %% Shutdown will automatically close
        %% files and free the buffer.
        ?DRV:shutdown(Port),
        Val.
    
    file_copy2(RFun, WFun) ->
        file_copy2(RFun, WFun, RFun()).
    file_copy2(RFun, WFun, {ok, N}) ->
        WFun(N),
        file_copy2(RFun, WFun);
    file_copy2(_RFun, _WFun, {error, 0}) ->
        ok;                 % End of file
    file_copy2(_RFun, _WFun, Error) ->
        Error.

In the section of the paper where I talk about the hassles/joys of
single-assignment, I point out that this function is fine as written
in Python but problematic for Erlang.

The return of pli2002_drv:malloc/2 is not a pointer cast as an integer
(though you can tell EDTK to do that for you) or hidden inside a
binary (though I'm tempted by that).  Instead it's what I've called
a "value map" tuple.  It looks like {valmap_ptr, 5}.  The "5" is an
index into a private value map table maintained by the driver.  The
index is what gets passed into the driver: here's free/2:

    free(Port,
         Ptr
            ) when is_port(Port) -> % TODO: Add additional constraints here
        {valmap_ptr, PtrIndex} = Ptr,
        B = <<?_FREE,
                PtrIndex:32/integer
            >>,
        erlang:port_command(Port, B),
        get_port_reply(Port).

The pattern matching extracts the index value, but it has the side
benefit of causing an error if the caller passed in something else.(*)
It's not foolproof, as I also discuss (I think?), but it works pretty
well.
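
To make the driver side of this concrete, here is a minimal C sketch
of a value map and of decoding the five-byte command that free/2
builds (one command byte followed by a 32-bit big-endian index).  It
is not EDTK's generated code: the names valmap_t, valmap_put,
valmap_get and decode_free_cmd, the fixed table size, and the value of
CMD_FREE (standing in for ?_FREE) are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VALMAP_SIZE 16   /* hypothetical fixed table size */
#define CMD_FREE    1    /* hypothetical stand-in for ?_FREE */

/* One table of driver-private pointers, addressed by a small index. */
typedef struct {
    void *slot[VALMAP_SIZE];   /* NULL marks an unused slot */
} valmap_t;

/* Store ptr in the first unused slot; return its index, or -1 if full. */
static int valmap_put(valmap_t *m, void *ptr)
{
    assert(ptr != NULL);       /* NULL is reserved for "unused" */
    for (int i = 0; i < VALMAP_SIZE; i++) {
        if (m->slot[i] == NULL) {
            m->slot[i] = ptr;
            return i;
        }
    }
    return -1;
}

/* Return the pointer behind an index, or NULL if the index is out of
   range or refers to an unused slot. */
static void *valmap_get(const valmap_t *m, uint32_t idx)
{
    return idx < VALMAP_SIZE ? m->slot[idx] : NULL;
}

/* Decode <<Cmd:8, Index:32/big>>.  Returns 0 and sets *idx on
   success, -1 if the buffer has the wrong length or command byte. */
static int decode_free_cmd(const unsigned char *buf, size_t len,
                           uint32_t *idx)
{
    if (len != 5 || buf[0] != CMD_FREE)
        return -1;
    *idx = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];
    return 0;
}
```

Because Erlang only ever sees the index, a forged or stale value makes
valmap_get return NULL, which the driver can report as an error
instead of dereferencing an arbitrary address.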

Anyway, the driver can hide whatever it wants behind these value map
thingies.  I have example drivers that store malloc'ed pointers as
well as file descriptors and Berkeley DB database and environment
"handles".

The driver's "stop" method calls the user's deallocation/close/cleanup
function for each entry in each value map whenever the port is closed,
whether because port_close/1 was called or because the owner process
died or was killed.  As the comment in file_copy/2 says, closing the port will
close both "FILE *" thingies (and their file descriptors and free any
additional buffers they were hiding) as well as free the malloc'ed
memory.
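
The cleanup sweep described above might look roughly like this in C.
This is a sketch under assumptions, not the code EDTK generates: the
valmap_t layout, the cleanup_fn type, and the function names
valmap_cleanup_all and driver_stop_sketch are hypothetical, and a real
driver would call such a sweep from its own stop callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define VALMAP_SIZE 16   /* hypothetical fixed table size */

/* User-supplied release function, e.g. free or an fclose wrapper. */
typedef void (*cleanup_fn)(void *);

/* One value map: its slots plus the function that releases an entry. */
typedef struct {
    void      *slot[VALMAP_SIZE];   /* NULL marks an unused slot */
    cleanup_fn cleanup;
} valmap_t;

/* Release every live entry in one map; return how many were released. */
static int valmap_cleanup_all(valmap_t *m)
{
    int released = 0;
    assert(m->cleanup != NULL);
    for (int i = 0; i < VALMAP_SIZE; i++) {
        if (m->slot[i] != NULL) {
            m->cleanup(m->slot[i]);
            m->slot[i] = NULL;      /* a second sweep becomes a no-op */
            released++;
        }
    }
    return released;
}

/* What a stop callback might do: sweep each value map in turn. */
static int driver_stop_sketch(valmap_t *maps, size_t nmaps)
{
    int total = 0;
    for (size_t i = 0; i < nmaps; i++)
        total += valmap_cleanup_all(&maps[i]);
    return total;
}
```

Keeping one cleanup function per map is what lets a single sweep
handle malloc'ed buffers, "FILE *" handles, and database handles
alike: each kind of resource lives in its own map with its own
release function.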

IMHO it's pretty slick.  It makes it almost impossible for a driver to
leak resources, unless the owner process allocates stuff and then
lives forever.  But then you've got bigger problems.

If that's the sort of sanity mechanism you're proposing, I agree, it's
very useful.

-Scott

(*) I borrowed this bit of type checking from SWIG, which encodes its
pointers as something like (fuzzy memory) "_0x01234f30_FILE_": it
includes both the pointer address and the pointer type.
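
For illustration only, here is a C sketch of the general idea of
carrying a type name alongside a pointer so that a mismatched type can
be rejected.  The string layout imitates the half-remembered format
quoted above and should not be taken as SWIG's actual encoding; the
function names encode_typed_ptr and decode_typed_ptr are hypothetical,
and this simple parser assumes type names that contain no underscore.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write "_0x<hex address>_<type>_" into out.
   Returns 0 on success, -1 if the buffer is too small. */
static int encode_typed_ptr(char *out, size_t n,
                            const void *p, const char *type)
{
    assert(type != NULL);
    int w = snprintf(out, n, "_0x%" PRIxPTR "_%s_", (uintptr_t)p, type);
    return (w > 0 && (size_t)w < n) ? 0 : -1;
}

/* Parse a string produced by encode_typed_ptr.  Returns the pointer
   only if the embedded type name equals expected_type; otherwise
   (malformed string or wrong type) returns NULL. */
static void *decode_typed_ptr(const char *s, const char *expected_type)
{
    uintptr_t addr = 0;
    char type[32];

    if (sscanf(s, "_0x%" SCNxPTR "_%31[^_]_", &addr, type) != 2)
        return NULL;
    if (strcmp(type, expected_type) != 0)
        return NULL;
    return (void *)addr;
}
```

The {valmap_ptr, 5} tuple plays the same role as the type suffix here:
the tag says what kind of thing the value refers to, and a mismatch is
caught before the value is used.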



