[erlang-questions] Package Support/Use

Wed Nov 8 02:58:53 CET 2006

In response to
	> 	   {ok,FD} = open(Name),

I wrote:
	> Again, that is pretty unforgivable Erlang:  IF OPEN/1 CREATES
	> A RESOURCE WHICH WILL NOT BE AUTOMATICALLY FREED.

Ulf Wiger retorted:
	Incorrect. The file will be closed automatically if the
	process dies.

But how does that contradict my IF?

The only open/1 I can find in the on-line documents is
win32reg:open(OpenModeList).  The documentation is a little confusing;
one would expect the result to be a RegHandle, but we aren't told that,
and we are NOT told that the registry handle will be automatically
closed if the process dies.  Nor are we told whether sending the handle
to another process will give the other process access to the registry
or not.

There is epp:open(File_Name, Include_Path[, Predefined_Macros]),
which opens a file for preprocessing.  The epp manual page does NOT
say that the file will be automatically closed if the process dies.
Nor are we told whether sending the Epp result to another process
will give the other process the ability to read from the file or not.

There is also erl_tar:open(File_Name, Open_Mode_List), which opens
a tar file for reading or writing.  There is an internal link in the
erl_tar manual page (under close/1) to open/1, but no open/1 is described.
We are NOT told that the tar file will be automatically closed if the
process dies.  Nor are we told whether sending the descriptor to another
process will give the other process access to the tar file or not.

Curiously, the index in the on-line stdlib reference manual contains no
entry for file:open(_, _), which makes me wonder what else that index is
missing.  (Come to think of it, the file: module isn't in the table of
contents side bar.  Ohhh, it's in the KERNEL.  I must say that the
distinction between kernel and stdlib in the Erlang documentation is about
as helpful as the distinction between (2) and (3) in the UNIX manual is
to the average C programmer, which is to say that it is a pain in the neck,
to put it politely.  At the very least we need a unified function index.)

disk_log:open(Option_List) IS documented as automatically closing the log
when any owner process dies, but although we are explicitly told that the
log process is linked to the owner process, we are NOT told that it is
linked to user processes, and a reasonable inference from the disk_log:
manual page would be that failure of a user process to close the log will
result in the log staying open.

gen_udp:open(Port[, Options]) -> {ok,Socket} | ...
Since the port number becomes associated with the calling process, it would
make a great deal of sense for such a port to be automatically closed when
the calling process dies, but the documentation does NOT say so.  You can
change the controlling_process/2 of a UDP socket, which means that you can
force messages to be delivered to a process that is not expecting them.  We
can imagine situations where a supervisor creates a socket and passes control
of it to a worker, and if the worker dies, the supervisor creates another and
passes control of the socket to that, so it isn't entirely obvious that
automatic closing is always wanted.  Certainly no-one reading the documentation
would imagine that it ever happened.

wrap_log_reader:open(File_Name[, N]) -> {ok,Continuation} | ...
The documentation does NOT say that the log reader is closed automatically
when the calling process dies.  Nor does it say whether passing a continuation
to another process works.

file:open(File_Name, Options) -> {ok,IO_Device} | ...
Now this one *does* say
    IO_Device is really the pid of the process which handles the file.
    This process is linked to the process which originally opened the file.
    If any process to which the IO_Device is linked terminates, the file
    will be closed and the process itself will be terminated.

But this is not the function which was called in the example!
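
The linking behaviour quoted above for file:open/2 can be checked directly. The following is a minimal sketch, not part of the original exchange: the module name owner_demo and the scratch file name are made up for illustration, and it assumes a non-raw open, so that the IO_Device really is a pid.

```erlang
-module(owner_demo).
-export([run/0]).

%% Open a file in a helper process, hand the IO_Device to the caller,
%% then kill the helper and watch the file's server process die with it.
run() ->
    Caller = self(),
    Path = "owner_demo.tmp",            % hypothetical scratch file
    Opener = spawn(fun() ->
        {ok, Dev} = file:open(Path, [write]),
        Caller ! {device, Dev},
        receive stop -> ok end          % stay alive until told otherwise
    end),
    Device = receive {device, D} -> D end,
    true = is_pid(Device),              % a non-raw IO_Device is a pid
    Ref = erlang:monitor(process, Device),
    exit(Opener, kill),                 % the opener dies without closing
    receive
        {'DOWN', Ref, process, Device, _Reason} -> ok
    after 5000 ->
        erlang:error(device_still_alive)
    end,
    %% The handle the caller still holds is now useless.
    {error, _} = file:write(Device, <<"x">>),
    ok = file:delete(Path),
    ok.
```

Note which process matters here: the file goes away when the process that OPENED it dies, not when some other process holding the handle dies.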

Now imagine the following situation:
    A user interface offers features like "Decompress file".
    When you select that option, it pops up a file selection dialogue
    so that you can choose the file to decompress, and another one so
    that you can choose the destination.  The user interface itself
    tries to open these files, so that it can report problems quickly
    and simply.  When the files are successfully opened, it spawns a
    new decompressor process, passing it the IO_Devices for the two
    files.

    The input is not in fact the result of compression, so the decompressor
    encounters what it thinks of as bad data, and dies.  But that wasn't
    the process that opened the files, so they are not automatically closed.
    The user interface code expected the decompressor to deal with
    everything, including closing the files, so it has long forgotten all
    about them.  LEAK!

    Clearly, there needs to be a file:controlling_process(IO_Device, PID)
    so that the user interface can do

	PID = spawn(decompressor, start, Options),
	controlling_process(Source_Device, PID),
	controlling_process(Destination_Device, PID),
	PID!{decompress, Source_Device, Destination_Device, self()}

    Actually, that's not perfect.
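
    In the absence of a file:controlling_process/2, one possible
    workaround is for the process that opened the files to monitor the
    worker and close the files itself when the worker goes away.  The
    following is only a sketch under that assumption; the module and
    function names (handoff_demo, run_with_cleanup/2) are hypothetical,
    and a real user interface would not want to block in the receive.

```erlang
-module(handoff_demo).
-export([run_with_cleanup/2]).

%% Run Work(Devices) in a fresh, monitored process.  However that
%% process ends, the caller (which opened the devices) closes them
%% and returns the worker's exit reason.
run_with_cleanup(Devices, Work) ->
    {Pid, Ref} = spawn_monitor(fun() -> Work(Devices) end),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            lists:foreach(fun(Dev) -> file:close(Dev) end, Devices),
            Reason
    end.
```

    With this, a decompressor that dies on bad data no longer leaks the
    two files, at the price of the opener having to remember them.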

It's amazing what you find (or don't find) when you look hard at the
documentation.  There are GENERAL issues for all

    X = <( create resource handle )>,
    ...
    <( close resource handle )>(X)

patterns.

(1) Will the resource be automatically closed if the calling
    process dies?

(2) If the handle is sent to another process, will the other process
    be able to use the resource?

(3) If the answers to (1) and (2) are "yes", is there a way to
    transfer ownership of a resource from the sender to the receiver?

These answers need to be given EXPLICITLY in the documentation of each
such resource.

To return to the original point, we were given no particular reason
to believe that open/1 was really the file:open/2 function, and since
most of the 'open' functions are NOT documented as auto-close, Ulf Wiger
has no grounds for his dogmatic "incorrect".

	> Verbose, yes, a little bit.  But it certainly doesn't
	> REQUIRE you to "bind worthless intermediate variables",
	> that's a straw man.

	But in fact, lots of erlang code out there actually looks
	like this (and I think it's fair to say that Mats has seen
	more Erlang code than most).

So what?  My point is that IT DOESN'T HAVE TO BE THAT WAY, and the
fact that it often _is_ that way really doesn't contradict or even
weaken my point at all.

Let me give you a very good analogy.  Last night I spent more hours
than I care to remember trying to read some of the R11B Erlang compiler,
and that was just the beam*.?rl files.  When I got my hands on the
Quintus compiler, it had just two comments, one of which was the
copyright notice.  The beam*.?rl files are better than that, but not
a LOT better.  Practically none of the documentation I needed to understand
what I was reading existed.  If I had to maintain those files, it would
take me a couple of weeks to document them before I could start (as it
did take me a couple of weeks to document the QP compiler before I could
maintain it).  This is the commenting equivalent of the style Mats was
criticising.

BUT you cannot fault the Erlang syntax for this!  Banning comments because
the comments that _are_ there aren't as helpful as they should be would not
be a step forward.

	The style of programming Mats calls TradErl

This may be the source of confusion.  I thought he was calling the
*INTERFACE* (returning tagged values instead of exceptions) TradErl.
Ulf Wiger says he was calling the STYLE OF (AB)USING that interface
TradErl.

	The 1-minute google test gave me the gen_tcp man page:
	http://www.erlang.org/doc/doc-5.5.1/lib/kernel-2.11.1/doc/html/gen_tcp.html

	where example code looks like:

	server() ->
	     {ok, LSock} = gen_tcp:listen(5678, [binary, {packet, 0},
	                                         {active, false}]),
	     {ok, Sock} = gen_tcp:accept(LSock),
	     {ok, Bin} = do_recv(Sock, []),
	     ok = gen_tcp:close(Sock),
	     Bin.

Which would be better, while retaining the traditional Erlang *interfaces*,
as

    server() ->
	Sock = ok(gen_tcp:accept(
                  ok(gen_tcp:listen(5678, [binary,{packet,0},{active,false}])))),
	Answer = do_recv(Sock, []),
	gen_tcp:close(Sock),
	ok(Answer).
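
The ok/1 helper used above is not a library function.  A minimal sketch of what it is assumed to do (unwrap a tagged result, or fail with the offending term) might look like this; the module name tagged and the {not_ok, ...} error term are made up for illustration.

```erlang
-module(tagged).
-export([ok/1]).

%% Unwrap a tagged result, or fail loudly with the original term.
ok({ok, Value}) -> Value;
ok(ok)          -> ok;
ok(Other)       -> erlang:error({not_ok, Other}).
```

So ok({ok,42}) gives 42, while ok({error,enoent}) raises an error that carries {error,enoent} with it.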

By the way, do we have another documentation problem here?
The documentation for gen_tcp:listen(Port, Options) says

    "The returned socket Listen_Socket can only be used in calls to
     accept/1,2."

Does that mean that you can't call gen_tcp:close(LSocket)?
What DOES happen if you don't close a listen socket?

Let's see server/0 with careful closing, assuming for the sake of
argument that listen sockets SHOULD be closed.  Certainly in a UNIX
implementation, where you call socket(), then bind(), then listen(),
and that's your Listen_Socket, and then call accept() on that, which
is your gen_tcp:accept/1,2, the listen socket takes a file descriptor
slot and has to be closed like any other file descriptor.  So we'd have

    server() ->		% exception version
	Listen_Socket =
	    gen_tcp:listen(5678, [binary,{packet,0},{active,false}]),
	try
	    Accept_Socket = gen_tcp:accept(Listen_Socket),
	    try
	        do_recv(Accept_Socket, [])
	    after
	        gen_tcp:close(Accept_Socket)
	    end
	after
	    gen_tcp:close(Listen_Socket)
	end.

compared with

    server() ->		% traditional version
	case gen_tcp:listen(5678, [binary,{packet,0},{active,false}])
	 of  {ok,Listen_Socket} ->
	     case gen_tcp:accept(Listen_Socket)
	      of  {ok,Accept_Socket} ->
		  Answer = do_recv(Accept_Socket, []),
	          gen_tcp:close(Accept_Socket)
	       ;  E -> Answer = E
	     end,
	     gen_tcp:close(Listen_Socket)
          ;  E -> Answer = E
        end,
        Answer.

This is admittedly the kind of code I'd start writing combinators for
in Haskell.  But it is no longer than the exception-handling version and
introduces no variables that the exception-handling version does not have
to introduce.  

I DO NOT REGARD EITHER VERSION AS PLEASANT.  The exception-based version
is too much like Java for my taste, and worse, it separates the two
key operations (open and close) far too much.  Sadly, the Lisp
unwind-protect form does the same thing.  So let me illustrate what
I consider to be a better presentation with the aid of a macro:

    (define-macro (let-protected Bindings Cleanup . Body)
      `(let ,Bindings
         (unwind-protect
           (begin ,@Body)
           ,Cleanup)))

    (define (server)
      (let-protected
	((Listen-Socket (gen-tcp:listen 5678 '(binary (packet 0) (active #f)))))
	(gen-tcp:close Listen-Socket)
	(let-protected
	  ((Accept-Socket (gen-tcp:accept Listen-Socket)))
	  (gen-tcp:close Accept-Socket)
	  (do-recv Accept-Socket '()))))

This puts the close right next to the open where you can SEE it and
habitually check that it is there.
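
Something close to this grouping can be had in Erlang as it stands, using a higher-order function instead of a macro.  This is only a sketch: the names protect, with_resource/3 and read_first_line/1 are hypothetical, and a file is used instead of a socket to keep the example short.

```erlang
-module(protect).
-export([with_resource/3, read_first_line/1]).

%% Acquire() yields the resource; Release(Resource) always runs after
%% Body(Resource), whether Body returns normally or raises.
with_resource(Acquire, Release, Body) ->
    Resource = Acquire(),
    try
        Body(Resource)
    after
        Release(Resource)
    end.

%% Usage: the open and the close sit next to each other.
read_first_line(Path) ->
    with_resource(
      fun() -> {ok, Dev} = file:open(Path, [read]), Dev end,
      fun file:close/1,
      fun(Dev) -> file:read_line(Dev) end).
```

The funs are noisier than the macro, but the acquire and release steps are adjacent and the body comes last.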

	> The real issue is how exceptional the exceptional condition is and
	> whether it is likely that the caller can do anything in an error
	> case other than pass it on.

	Agreed.

	> A regular expression not matching is not exceptional.
	> A regular expression having bad syntax IS exceptional.
	> Reaching the end of a file *between* characters is not exceptional.
	> Reaching the end of a file *within* the UTF-8 byte sequence for a
	> character IS exceptional.

	Agreed.

	>     - if the read fails, is there anything the caller can do about it?
	> 	A negative result from read(2) counts as exceptional; there is
	> 	probably nothing the program can do about it (except when errno
	> 	is EAGAIN, of course).
	> 	There is something the caller MUST do, and that is close the file.

	In C, but not in Erlang.

We are back to the issue I discussed at length in the earlier part of
this message.  There is NO reason to believe that the process calling
read is the same as the process that opened the file, so there is NO
reason to believe that the death of THIS process will result in the file
being automatically closed.

	> What does the code *really* look like in these two cases?
	>
	>       % tagged results
	>
	> 	Channel = ok(open(Name)),
	> 	Outcome = read(Channel),
	> 	close(Channel),
	> 	case Outcome
	> 	 of  {ok,Data} -> ok(process_data(Data))
	> 	  ;  _         -> ok(Outcome)
	> 	end
	>
	>       % exceptions
	>
	> 	Channel = open(Name),
	> 	Data = try
	> 	    read(Channel)
	> 	after
	> 	    close(Channel)
	> 	end,
	> 	process_data(Data)
	>
	> Funny how the difference becomes less clear-cut when you make the
	> example more realistic, isn't it?

	But in the "tagged results" example above, you don't
	close the Channel if read/1 crashes, right?

That's because read/1 is written in the tagged results style and
DOESN'T crash.  If you can make unfounded assumptions about an undocumented
and probably non-existent open/1, I can make them about an undocumented and
probably non-existent read/1.  Fair's fair.

Is there any prospect of fixing try..after..end to have its parts in the
right order?  Actually, broadening the scope of the 'let' keyword I've
hinted at before,

    let V1 = E1, ...
    after Cleanup
    in Body
    end

would give us

    server() ->
	let Listen_Socket = gen_tcp:listen(5678,
			                   [binary,{packet,0},{active,false}])
	after gen_tcp:close(Listen_Socket)
        in  let Accept_Socket = gen_tcp:accept(Listen_Socket)
            after gen_tcp:close(Accept_Socket)
            in do_recv(Accept_Socket, [])
            end
	end.

which is quite pretty, because all of the references to a resource are
grouped together in a _single_ form.