[erlang-questions] examples for erlang with joins

Tue Mar 10 02:55:23 CET 2009

On 10 Mar 2009, at 1:54 pm, Hubert Plociniczak wrote:
leaving me more confused than before.

There are all sorts of ways that one can approach something
like this.  Let me start by conceding that
   YES this increases the expressive power of Erlang receives and
   YES this increase may be very welcome for some problems and
   YES the Erlang canon is not closed, new ideas must not be
       rejected just because they are new (or because someone
       else thought of them) and
   YES the fact of an implementation with test cases creates a
       strong presumption in favour of the new feature.

What worries me is that it depends on an intricate and as yet
somewhat vaguely admubrated *procedural* semantics and that
it somehow seems to run contrary to some basic assumptions
about message passing in Erlang.

Where's Joe Armstrong when you need him?

Let me address the "basic assumptions" point.
Message passing in Erlang is modelled on message passing in
physical distributed systems.  This is necessary because at
least _some_ Erlang message passing really _is_ passing
messages over physical communication channels between physically
separated systems.  So
  - messages may fail to arrive at all
  - messages from different sources may arrive out of order
  - message transport times from different processes may
    differ by several orders of magnitude.

All of that makes something that relies on looking for messages
from more than one process somewhat problematic.

If process A sends a message to C before process B does,
the messages could arrive in the order (A,B) or the order (B,A).
It follows, does it not, that the matching process for
	receive P and Q ...
must be just as willing to match P=newer, Q=older as to match
P=older, Q=newer, and that such an "all ways around" match must
be completed before looking at the next clause.  So it's not just
because it would be too hard to understand that (receive P and Q...)
and (receive Q and P...) must have indistinguishable behaviour.

However, if process A sends messages X then Y to process C,
it -is- guaranteed that (if both arrive) X will arrive before Y.
The messages in a mail box are ordered, and this ordering is
consistent with the partial order of events.

This means that Erlang's "try each message in turn from oldest
to newest until one matches and then commit to that" does make
some kind of sense.  If A sent first X and then Y, and if either
X or Y would match, then Y cannot overtake X.

It appears to me that in the presence of join-receive, this is
no longer true.  Suppose we have
	oldest [Z, X, Y] newest
and we have
	receive P and Q -> ping ; R -> pong end
where P matches Z, Q matches Y, and R matches X.
In this example, the oldest message of all is indeed chosen,
but the *newer* message Y overtakes the *older* message X.

But if the message queue were
	oldest [X, Z, Y] newest
then X would have been chosen.  So whether Y overtakes X or not
depends on the relative order in which Z (from another process)
was received, and that's not at all well defined.

There's nothing much an implementation can do about this,
because messages in a mailbox are not tagged with the pid
of the originating process.

Sometimes we *want* a newer (urgent) message to overtake an
older (routine) message.  Priority messages are one of the
things that's often been asked for, and while we _can_ program
that, it's not as immediate as PRI ALT in Occam.  But even then,
we don't want it to _maybe_ overtake or _maybe_ not.

We already have people worried about the cost of scanning the
mailbox.  It's difficult to see how join-receive won't make
this worse, at least not without elaborate indexing machinery
like the RETE algorithm.  (Such machinery might be merited _anyway_.
That's another argument.)

I would like to hear a lot more about the use cases for this
proposal.