[erlang-questions] Ideas for a new Erlang

Thu Jun 26 21:18:00 CEST 2008

Richard A. O'Keefe writes:
 > 
 > On 26 Jun 2008, at 9:57 am, Ulf Wiger wrote:
 > > Haskell has channels, and so does .NET (mailbox objects).
 > 
 > And Concurrent ML, amongst others, had them even earlier.
 > > There is
 > > therefore an opportunity to compare programs and try to determine
 > > whether programming with channels makes for more or less
 > > readable code than erlang's selective receive.
 > 
 > It seems to me that programming with Erlang's receive is
 > far simpler to do, to read, and to reason about than
 > "channels" (strictly speaking, mailboxes).  The reason
 > is not far to seek:
 > 
 >     if channels are "objects", they can go ANYWHERE.
 > 
 > There's another one.  Although Nystrom presented two
 > receive forms, the text made it clear that two more were
 > required.  But he had forgotten the need for something
 > like BSD select()/System V poll()/Ada selective accept/
 > other languages' multiwaits.  That means another two
 > forms.
 > 
 > 	nystrom_receive()	% default channel
 > 	nystrom_receive(Timeout)
 > 	nystrom_receive_from(Channel)
 > 	nystrom_receive_from(Channel, Timeout)
 > 	nystrom_receive_from_any(Channel_Set)
 > 	nystrom_receive_from_any(Channel_Set, Timeout)

True. But they are all simpler than selective receive.
 > 
 > But it gets worse.
 > 
 >      if channels are "objects", they can go anywhere,
 >      BUT THEY DON'T GET THERE WITHOUT BEING CARRIED!

In Erlang today, if one wants to keep track of communication with a
particular process, one needs to carry the pid.
 > 
 > You have to explicitly pass them around, especially loops.
 > So to handle the simple bounded buffer you would find
 > yourself writing stuff like this:
 > 
 > 	buffer(Status, Contents, GetChan, PutChan) ->
 > 	   Channels = case Status
 > 			of full  -> [GetChan]
 > 			 ; empty -> [PutChan]
 > 			 ; _     -> [GetChan,PutChan]
 > 		      end,
 > 	   case nystrom_receive_from_any(Channels)
 > 	     of {GetChan,Who} ->
 > 		{Status1,Contents1,Msg} = pop(Contents),
 > 		Who ! Msg
 > 	      ; {PutChan,Msg} ->
 > 		{Status1,Contents1} = add(Contents, Msg)
 > 	   end,
 > 	   buffer(Status1, Contents1, GetChan, PutChan).

I've commented on your bounded_buffer example in my previous mail. I'm
not sure what you are doing here. It might help if you could show the
example using selective receive.
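
For comparison, a selective-receive version of a bounded buffer might
look roughly like the sketch below. It is only an illustration: the
module name bbuf, the functions start/1, insert/2 and fetch/1, the
{put, Msg} and {get, Who} message shapes, and the use of the queue
module are all assumptions, not taken from the earlier mail.

```erlang
-module(bbuf).
-export([start/1, insert/2, fetch/1]).

%% Hypothetical sketch; names and message shapes are assumptions.

%% Start a buffer process that holds at most Max items.
start(Max) when is_integer(Max), Max > 0 ->
    spawn(fun() -> loop(queue:new(), 0, Max) end).

%% Asynchronous put: the message waits in the mailbox if the buffer is full.
insert(Buffer, Msg) ->
    Buffer ! {put, Msg},
    ok.

%% Synchronous get: blocks until the buffer replies with an item.
fetch(Buffer) ->
    Buffer ! {get, self()},
    receive
        {Buffer, Msg} -> Msg
    end.

loop(Q, N, Max) ->
    receive
        %% Selective receive: a put is matched only while there is room ...
        {put, Msg} when N < Max ->
            loop(queue:in(Msg, Q), N + 1, Max);
        %% ... and a get is matched only while something is stored.
        {get, Who} when N > 0 ->
            {{value, Msg}, Q1} = queue:out(Q),
            Who ! {self(), Msg},
            loop(Q1, N - 1, Max)
    end.
```

In this sketch the guards do the work of the Status argument: a message
whose guard fails simply stays in the mailbox until the state changes.
Puts that arrive while the buffer is full therefore accumulate in the
mailbox, which is not itself bounded.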

 > 
 > That's assuming that multireceive returns a {Channel,Message}
 > pair.  Another approach would be to pass a list of {Channel,
 > Handler} pairs, when the code would look like
 > 
 > 	buffer(Status, Contents, GetChan, PutChan) ->
 > 	    GetHandler = fun (Who) ->
 > 		{Status1,Contents1,Msg} = pop(Contents),
 > 		Who ! Msg,
 > 		buffer(Status1, Contents1, GetChan, PutChan)
 > 	    end,
 > 	    PutHandler = fun (Msg) ->
 > 		{Status1,Contents1} = add(Contents, Msg),
 > 		buffer(Status1, Contents1, GetChan, PutChan)
 > 	    end,
 > 	    nystrom_receive_from_any(
 > 		case Status
 > 		  of full  -> [{GetChan,GetHandler}]
 > 		   ; empty -> [{PutChan,PutHandler}]
 > 		   ; _     -> [{GetChan,GetHandler},
 > 			       {PutChan,PutHandler}]
 > 		end).
 > 
 > How this is in any way simpler than a selective receive
 > entirely escapes me.

This example escapes me, too.

 > 
 > There's another problem with this.
 > I can easily fire up a bounded buffer process in Erlang:
 > 
 > 	Buffer = spawn (fun () ->
 > 	    buffer(empty, empty_buffer_contents())
 > 	end),
 > 
 > because spawn/1 delivers the Pid *outside* the process.
 > But new_channel/0 delivers its result *inside* the
 > process.  This isn't so in Concurrent ML, where mailboxes
 > will talk to anybody.  In CML, the parent process would
 > create the mailboxes GetChan and PutChan:
 > 
 > 	GetChan = Mailbox.mailbox(),
 > 	PutChan = Mailbox.mailbox(),
 > 	Buffer = spawn (fun () ->
 > 	    buffer(empty, empty_buffer_contents(),
 > 		   GetChan, PutChan)
 > 	end),
 > 
 > But in Nystrom's proposal, "only the creator of a channel may
 > receive messages from it" (p 5).  So the channels have to be
 > created *inside* the new buffer process.  How then do other
 > processes get their "hands" on them?
 > 
 > Ahah!  It's all so EASY without selective receive!
 > 
 > 	GetChanChan = new_channel(),
 > 	PutChanChan = new_channel(),
 > 	Buffer = spawn (fun () ->
 > 	    GetChan = new_channel(),
 > 	    PutChan = new_channel(),
 > 	    GetChanChan ! GetChan,
 > 	    PutChanChan ! PutChan,
 > 	    buffer(empty, empty_buffer_contents(),
 > 		   GetChan, PutChan)
 > 	end),
 > 	GetChan = ne_receive(GetChanChan),
 > 	PutChan = ne_receive(PutChanChan),
 > 	...

Let me just say that I find your bounded buffer example profoundly
unconvincing. The buffer may be bounded, but there is nothing to
prevent another process from filling the mailbox with "put" messages.

 > By the way, recall that the problem that this "simplification"
 > is supposed to solve is this: "it is considered bad style to
 > leave too many messages in the mailbox".  Let me quote from the
 > Concurrent ML documentation for the Mailbox structure:

Well, that was never the main problem. Selective receive is complex
and unnecessary. That was my main point. Programmers sometimes manage
to get into problems with it, but it was always my impression that
there were strategies to avoid these problems. I thought, however, that
the fact that one had to take certain precautions to avoid getting
into trouble strengthened the argument against selective receive.

 > 	Note that mailbox buffers are unbounded, which means that
 > 	there is no flow control to prevent a producer from greatly
 > 	outstripping a consumer, and thus exhausting memory.
 > 
 > If your mailbox is getting full, it's not another questionably
 > simpler language construct you need.  It's FLOW CONTROL, which
 > is a higher level protocol issue.
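
One way to picture flow control as a protocol-level matter is a put
that is acknowledged, so that a producer cannot run ahead of the
buffer. The sketch below is only an illustration under assumptions:
the module name fc_buf, the functions start/1, insert/2 and fetch/1,
and the {put, From, Ref, Msg} and {get, From, Ref} message shapes are
hypothetical and do not come from the thread.

```erlang
-module(fc_buf).
-export([start/1, insert/2, fetch/1]).

%% Hypothetical sketch; names and message shapes are assumptions.

%% Start a buffer process that holds at most Max items.
start(Max) when is_integer(Max), Max > 0 ->
    spawn(fun() -> loop(queue:new(), 0, Max) end).

%% Blocks the caller until the buffer has accepted Msg, so each
%% producer has at most one outstanding put in the buffer's mailbox.
insert(Buffer, Msg) ->
    Ref = make_ref(),
    Buffer ! {put, self(), Ref, Msg},
    receive
        {Ref, ok} -> ok
    end.

%% Blocks the caller until an item is available.
fetch(Buffer) ->
    Ref = make_ref(),
    Buffer ! {get, self(), Ref},
    receive
        {Ref, Msg} -> Msg
    end.

loop(Q, N, Max) ->
    receive
        %% Accept (and acknowledge) a put only while there is room.
        {put, From, Ref, Msg} when N < Max ->
            From ! {Ref, ok},
            loop(queue:in(Msg, Q), N + 1, Max);
        %% Serve a get only while something is stored.
        {get, From, Ref} when N > 0 ->
            {{value, Msg}, Q1} = queue:out(Q),
            From ! {Ref, Msg},
            loop(Q1, N - 1, Max)
    end.
```

With this shape the number of pending put messages is bounded by the
number of producers rather than by how fast they run; a producer that
bypasses insert/2 and sends raw messages can of course still flood
the mailbox.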

 > One thing I particularly like about 'receive' in Erlang is the
 > fact that it is visually hard to miss.  Things that look like
 > function calls are much easier to lose sight of in the thick
 > undergrowth of things that look like function calls.  In fact it
 > won't even do to just look for 'ne_receive' (or 'nystrom_receive*').
 > (It _is_ possible to hide 'receive', but it's a whole lot harder.)

That is true. One might want to look for something that is easier to
find in a block of code.

Sven-Olof