[erlang-questions] question re. message delivery

Wed Sep 27 16:42:59 CEST 2017

I'll try to summarize what I know on the topic, acting a bit as a ghost
writer for the VM Team.

See also the old, outdated language specification, which is the best we have.
It is still the soother that the VM Team sucks on when they do not know
what else to do, and an updated version is in the pipeline.
See especially 10.6.2 Order of signals:

    http://erlang.org/download/erl_spec47.ps.gz

Also see, especially 2.1 Passing of Signals "signals can be lost":

    http://erlang.org/doc/apps/erts/communication.html

Message order between a pair of processes is guaranteed.  I.e. a message
sent after another message will not be received before that other message.

Messages may be dropped.  In particular due to a communication link
between nodes going down and up.

If you set a monitor (or a link) on a process you will get a 'DOWN'
message if the other process vanishes i.e. dies or communication link lost.
That 'DOWN' message is guaranteed to be delivered (the same applies to
links and 'EXIT' messages).

An example: if process P1 first sets a monitor on process P2 and then
sends messages M1, M2 and M3 to P2.  P2 acknowledges M1 with M1a and M3
with M3a.  Then if P1 gets M1a it knows that P2 has seen M1 and P1 is
guaranteed to eventually get either M3a or 'DOWN'.  If it gets M3a then
it knows P2 have seen M2 and M3.  If it gets 'DOWN' then M2 may have been
either dropped or seen by P2, the same applies to M3, and P1 may eventually
get M3a knowing that P2 has seen M3, but can not know if it has seen M2.

Another example: gen_server:call first sets a monitor on the server process,
then sends the query.  By that it knows it will eventually either get
the reply or 'DOWN'.  If it gets 'DOWN' it actually may get a late reply
(e.g. network down-up), which is often overlooked.

The distribution communication is is per default implemented with TCP links
between nodes.  The VM relies on the distribution transport to deliver
messages in order, or to die meaning that the link has failed and that any
number of messages at the end of the sequence may have been dropped.

Process links and monitors towards processes on remote nodes are registered
in the local node on the distribution channel entry for that remote node,
so the VM can trigger 'DOWN' and 'EXIT' messages for all links and monitors
when a communication link goes down.  These messages are guaranteed to be
delivered (if their owner exists).

I hope this clears things up.
/ Raimo Niskanen

On Tue, Sep 26, 2017 at 05:16:56PM -0700, Miles Fidelman wrote:
> Hi Joe,
> 
> Hmmm....
> 
> Joe Armstrong wrote:
> 
> > What I said was "message passing is assumed to be reliable"
> 
> >
> > The key word here is *assumed* my assumption is that if I open a TCP 
> > socket
> > and send it five messages numbered 1 to 5 then If I successfully read
> > message
> > 5 and have seen no error indicators then I can *assume* that messages 1 to
> > 4 also arrived in order.
> >
> 
> Well yes, but with TCP one has sequence numbers, buffering, and 
> retransmission - and GUARANTEES, by design, that if you (say a socket 
> connection) receive packet 5, then you've also received packets 1-4, in 
> order.
> 
> My understanding is that Erlang does NOT make that guarantee.  As stated:
> 
> - message delivery is assumed to be UNRELIABLE
> 
> - ordering is guaranteed to be maintained
> 
> The implication being that one might well receive packets 1, 2, 3, 5 - 
> and not know that 4 is missing.
> 
> > Actually I have no idea if this is true - but it does seem to be a
> > reasonable
> > assumption.
> >
> > Messages 1 to 4 might have arrived got put in a buffer prior to my reading
> > them and accidentally reordered due to a software bug. An alpha particle
> > might have hit the data in message 3 and changed it -- who knows?
> 
> 
> More likely, a TCP connection has dropped, taking a message or two with 
> it, and once the connection is re-established, stuff starts flowing 
> after a gap.
> 
> With UDP, packets could arrive out of order as well as get dropped.
> 
> There are ways to extend TCP, or write a higher level protocol that will 
> detect dropped connections, and packets, reconnect, request 
> retransmission - with the result that both the sender & receiver are 
> guaranteed both delivery & order.
> 
> Which brings us back to implementation.
> 
> >
> > Having assumed that message passing is reliable I build code based on
> > this assumption.
> 
> But, for Erlang, we can't make this assumption - the documentation 
> specifically says so.
> 
> >
> > I'm not, of course, saying that the assumption is true, just that I trust
> > the
> > implementers of the system have done a good job to try and make it true.
> > Certainly any repeatable counter examples should have been investigated
> > to see if there were any errors in the system.
> >
> > All this builds on layers of trust. I trust that erlang message passing is
> > ordered and reliable in the absence of errors.
> >
> > The Erlang implementers trust that TCP is reliable.
> 
> 
> Well, that is the question, isn't it.  Lots of things cause TCP to drop 
> connections.  So the question remains - how are dropped connections 
> handled?  And, if after a connection is dropped and restored, how are 
> dropped messages and/or messages received out of order handled?
> 
> Actually, there's another design question in there - in a multi-node 
> Erlang system, maintaining n2 TCP connections seems just a tad 
> unwieldy.  Personally, I'd be more likely to use a connectionless 
> protocol, maybe even broadcast.
> 
> 
> >
> > The TCP implementors trust that the OS is reliable.
> >
> > The OS implementors trust that the processor is reliable.
> >
> > The processor implementors trust that the VLSI compilers are correct.
> >
> > Software runs on physical machines - so really the laws of physics 
> > apply not
> > maths. Physics takes into account space and time, and the concept of
> > simultaneity does not exist, no so in maths.
> >
> > It seems to me that software is built upon chains of trust, not upon
> > mathematical chains of proof.
> >
> > I've just been saying "what we want to achieve" and not "how we can 
> > achieve
> > it".
> 
> Which brings us back to:
> 
> stated goals:  unreliable delivery, ordered delivery
> 
> The BEAM Book details how this works within a node, but is silent on how 
> distributed Erlang is implemented.  I'm really interested in some details.
> 
> >
> > The statements that people make about the system should be in terms
> > of belief rather than proof.
> >
> > I'd say "I believe we have reliable message passing"
> > It would be plain daft to say "we have reliable message passing" or
> > "we can prove it be correct" since there is no way of validating this.
> 
> Sure there is.  The state machine model of TCP is very clearly defined, 
> including its various error conditions.  And one can test an 
> implementation for adherence to the state machine model.  (In some 
> cases, one can also demonstrate that software is provably correct - but 
> let's not go there).
> 
> 
> >
> > Call me old fashioned but I think that claims that, for example,
> > "we have unlimited storage" and so on are just nuts ...
> 
> Agreed.  But claims like "when allocated storage reaches 80% use, 
> additional storage is allocated by <mechanism>" are not just reasonable, 
> but mandatory when designing systems that have to scale under uncertain 
> load.
> 
> Which brings us back to - how is message passing implemented between 
> Erlang nodes?
> 
> Cheers,
> 
> Miles
> 
> -- 
> In theory, there is no difference between theory and practice.
> In practice, there is.  .... Yogi Berra
> 

> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB