[erlang-questions] Message order and exit(Reason)

Tue Jan 29 09:39:59 CET 2013

On 2013-01-29 09:00 , Ignas Vyšniauskas wrote:
> On 01/28/2013 10:34 PM, Rickard Green wrote:
>> The only signal ordering guarantee given is the following. If an
>> entity sends multiple signals to the same destination entity, the
>> order will be preserved. That is, if A sends a signal S1 to B, and
>> later sends the signal S2 to B, S1 is guaranteed not to arrive after
>> S2.
>
> But this is true only for single-node semantics/settings, correct?
>
> At least this paper[1] by Koen Claessen and Hans Svensson claims that:
>
>> In Fredlund’s semantics the delivery of a message is instantaneous,
>> meaning that all messages are delivered in exactly the order they are
>> sent. Now, this is actually true for processes running on the same
>> node, due to how the Erlang runtime system is implemented. It is,
>> however, not true in general for a concurrency oriented programming
>> language, and specifically not in a distributed setting with several
>> different Erlang nodes.

These are different things. The delivery order of signals between two 
*specific* processes is always guaranteed, even in a distributed 
setting. Even if process A is on node 1 and process B is on node 2, it's 
easy to preserve the order. What Fredlund is talking about is the
message order over all sends in the system. His semantics does not model
the time it may take in a distributed system for a message to be
transferred from the sender to the mailbox of the receiver. Within a
single node, it *used to* be the case that this operation was atomic.
However, in a modern Erlang node on a multicore machine, several sends
may be performed by different scheduler threads in true concurrent
fashion, so the atomicity assumption is fundamentally broken.
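
To make the distinction concrete, here is a toy model in Python (not
Erlang, and not how the runtime is implemented). It assumes one FIFO
queue per sender, all feeding a single receiver, and enumerates every
possible delivery schedule. The names deliveries, all_arrival_orders,
and the messages a1, a2, b1 are made up for illustration.

```python
from collections import deque
from itertools import permutations


def deliveries(channels, schedule):
    """Drain per-sender FIFO queues in the order given by `schedule`.

    channels: dict mapping a sender name to its messages, in send order.
    schedule: sequence of sender names; each entry delivers that
              sender's oldest undelivered message to the receiver.
    """
    queues = {sender: deque(msgs) for sender, msgs in channels.items()}
    mailbox = []
    for sender in schedule:
        mailbox.append((sender, queues[sender].popleft()))
    return tuple(mailbox)


def all_arrival_orders(channels):
    """Return every mailbox order reachable under some delivery schedule."""
    slots = [sender for sender, msgs in channels.items() for _ in msgs]
    return {deliveries(channels, sched) for sched in set(permutations(slots))}


# A sends a1 then a2; B independently sends b1; same receiver for all.
channels = {"A": ["a1", "a2"], "B": ["b1"]}
orders = all_arrival_orders(channels)
for order in sorted(orders):
    print([msg for _, msg in order])
```

In every order the model produces, a1 arrives before a2, which is the
pairwise guarantee. The position of b1 relative to A's messages varies
from schedule to schedule, which is why no order over all sends in the
system can be assumed.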

> In general, they have a similar example to the one described in this
> thread where problems with the exit signal appear in Chapter 5 and conclude:
>
>> After Fredlund proposed his single-node semantics, it was at least
>> thought ”morally OK” to use this semantics to reason about and model
>> distributed systems. However, messages actually can and do
>> arrive in different orders in a distributed setting as compared to a
>> local setting. A simple experiment involving 3 nodes already shows
>> this, even when the nodes are implemented as 3 run-time systems
>> running on the same workstation! Moreover, we discovered that the
>> above was not merely a theoretical anomaly, but an actual problem
>> with a real-life implementation. Along the same lines, many Erlang
>> developers think it morally OK to test their distributed system on a
>> single node. For the same reasons as mentioned above, errors might
>> slip through.
>
> There's also a more recent one[2] by Hans Svensson.
>
> So I guess the moral is that it's not all so simple and nice when you
> "really" start distributing and it's not a good idea to make strong
> assumptions even about the tiniest details.

No, the ordering guarantees exist and can be relied on. You just can't
assume a total order, even if that assumption makes your model easier to
reason about. If you do, your results won't be applicable to the real
world.
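
A relay through a third process shows how this bites in practice. The
sketch below is again a toy Python model under the same assumption (one
FIFO queue per sender/receiver pair, nothing else guaranteed); explore
and the message names are hypothetical. A sends "direct" to C and then
"ping" to B, and B forwards whatever it receives to C.

```python
from collections import deque


def explore(channels, c_mailbox, results):
    """Depth-first search over every possible choice of next delivery."""
    pending = [pair for pair, queue in channels.items() if queue]
    if not pending:
        results.add(tuple(c_mailbox))
        return
    for pair in pending:
        # Copy the state so that each branch is explored independently.
        nxt = {p: deque(q) for p, q in channels.items()}
        box = list(c_mailbox)
        msg = nxt[pair].popleft()
        _sender, receiver = pair
        if receiver == "C":
            box.append(msg)
        elif receiver == "B":
            # B relays every message it receives on to C.
            nxt[("B", "C")].append("relayed:" + msg)
        explore(nxt, box, results)


# A sends "direct" to C first, and only afterwards sends "ping" to B.
channels = {
    ("A", "C"): deque(["direct"]),
    ("A", "B"): deque(["ping"]),
    ("B", "C"): deque(),
}
results = set()
explore(channels, [], results)
print(sorted(results))
```

Both mailbox orders for C are reachable in this model, including the
one where the relayed message overtakes the direct one that was sent
earlier. No pairwise guarantee is violated, since the two messages reach
C from different senders.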

    /Richard