[erlang-questions] question re. message delivery

Raimo Niskanen <>
Thu Sep 28 14:40:37 CEST 2017


I have no objections.  You are absolutely right.
/ Raimo

On Thu, Sep 28, 2017 at 12:52:25PM +0200, Peer Stritzinger wrote:
> IIRC there is one guarantee in newer versions of Erlang (for a pretty conservative definition of new ;-)
> 
> All messages from another node arrive between its nodeup and nodedown message.
> 
> This means that you can always detect if there is a possible message loss between two nodes
> by monitoring the node.  
> 
> Or as I understand monitoring a remote process also does this implicitly since a link-down triggers 
> DOWN messages on all monitored processes across this link and a nodedown
> 
> Please correct me if this is not true (modulo the Hans Svensson/Lars-Åke Fredlund paper).
> 
> Cheers,
> -- Peer 
> 
> > On 27.09.2017, at 16:42, Raimo Niskanen <> wrote:
> > 
> > I'll try to summarize what I know on the topic, acting a bit as a ghost
> > writer for the VM Team.
> > 
> > See also the old, outdated language specification, which is the best we have.
> > It is still the soother that the VM Team sucks on when they do not know
> > what else to do, and an updated version is in the pipeline.
> > See especially 10.6.2 Order of signals:
> > 
> >    http://erlang.org/download/erl_spec47.ps.gz
> > 
> > 
> > Also see, especially 2.1 Passing of Signals "signals can be lost":
> > 
> >    http://erlang.org/doc/apps/erts/communication.html
> > 
> > 
> > Message order between a pair of processes is guaranteed.  I.e. a message
> > sent after another message will not be received before that other message.
> > 
> > Messages may be dropped.  In particular due to a communication link
> > between nodes going down and up.
> > 
> > If you set a monitor (or a link) on a process you will get a 'DOWN'
> > message if the other process vanishes i.e. dies or communication link lost.
> > That 'DOWN' message is guaranteed to be delivered (the same applies to
> > links and 'EXIT' messages).
> > 
> > An example: if process P1 first sets a monitor on process P2 and then
> > sends messages M1, M2 and M3 to P2.  P2 acknowledges M1 with M1a and M3
> > with M3a.  Then if P1 gets M1a it knows that P2 has seen M1 and P1 is
> > guaranteed to eventually get either M3a or 'DOWN'.  If it gets M3a then
> > it knows P2 have seen M2 and M3.  If it gets 'DOWN' then M2 may have been
> > either dropped or seen by P2, the same applies to M3, and P1 may eventually
> > get M3a knowing that P2 has seen M3, but can not know if it has seen M2.
> > 
> > Another example: gen_server:call first sets a monitor on the server process,
> > then sends the query.  By that it knows it will eventually either get
> > the reply or 'DOWN'.  If it gets 'DOWN' it actually may get a late reply
> > (e.g. network down-up), which is often overlooked.
> > 
> > The distribution communication is is per default implemented with TCP links
> > between nodes.  The VM relies on the distribution transport to deliver
> > messages in order, or to die meaning that the link has failed and that any
> > number of messages at the end of the sequence may have been dropped.
> > 
> > Process links and monitors towards processes on remote nodes are registered
> > in the local node on the distribution channel entry for that remote node,
> > so the VM can trigger 'DOWN' and 'EXIT' messages for all links and monitors
> > when a communication link goes down.  These messages are guaranteed to be
> > delivered (if their owner exists).
> > 
> > I hope this clears things up.
> > / Raimo Niskanen
> > 
> > 
> > 
> > On Tue, Sep 26, 2017 at 05:16:56PM -0700, Miles Fidelman wrote:
> >> Hi Joe,
> >> 
> >> Hmmm....
> >> 
> >> Joe Armstrong wrote:
> >> 
> >>> What I said was "message passing is assumed to be reliable"
> >> 
> >>> 
> >>> The key word here is *assumed* my assumption is that if I open a TCP 
> >>> socket
> >>> and send it five messages numbered 1 to 5 then If I successfully read
> >>> message
> >>> 5 and have seen no error indicators then I can *assume* that messages 1 to
> >>> 4 also arrived in order.
> >>> 
> >> 
> >> Well yes, but with TCP one has sequence numbers, buffering, and 
> >> retransmission - and GUARANTEES, by design, that if you (say a socket 
> >> connection) receive packet 5, then you've also received packets 1-4, in 
> >> order.
> >> 
> >> My understanding is that Erlang does NOT make that guarantee.  As stated:
> >> 
> >> - message delivery is assumed to be UNRELIABLE
> >> 
> >> - ordering is guaranteed to be maintained
> >> 
> >> The implication being that one might well receive packets 1, 2, 3, 5 - 
> >> and not know that 4 is missing.
> >> 
> >>> Actually I have no idea if this is true - but it does seem to be a
> >>> reasonable
> >>> assumption.
> >>> 
> >>> Messages 1 to 4 might have arrived got put in a buffer prior to my reading
> >>> them and accidentally reordered due to a software bug. An alpha particle
> >>> might have hit the data in message 3 and changed it -- who knows?
> >> 
> >> 
> >> More likely, a TCP connection has dropped, taking a message or two with 
> >> it, and once the connection is re-established, stuff starts flowing 
> >> after a gap.
> >> 
> >> With UDP, packets could arrive out of order as well as get dropped.
> >> 
> >> There are ways to extend TCP, or write a higher level protocol that will 
> >> detect dropped connections, and packets, reconnect, request 
> >> retransmission - with the result that both the sender & receiver are 
> >> guaranteed both delivery & order.
> >> 
> >> Which brings us back to implementation.
> >> 
> >>> 
> >>> Having assumed that message passing is reliable I build code based on
> >>> this assumption.
> >> 
> >> But, for Erlang, we can't make this assumption - the documentation 
> >> specifically says so.
> >> 
> >>> 
> >>> I'm not, of course, saying that the assumption is true, just that I trust
> >>> the
> >>> implementers of the system have done a good job to try and make it true.
> >>> Certainly any repeatable counter examples should have been investigated
> >>> to see if there were any errors in the system.
> >>> 
> >>> All this builds on layers of trust. I trust that erlang message passing is
> >>> ordered and reliable in the absence of errors.
> >>> 
> >>> The Erlang implementers trust that TCP is reliable.
> >> 
> >> 
> >> Well, that is the question, isn't it.  Lots of things cause TCP to drop 
> >> connections.  So the question remains - how are dropped connections 
> >> handled?  And, if after a connection is dropped and restored, how are 
> >> dropped messages and/or messages received out of order handled?
> >> 
> >> Actually, there's another design question in there - in a multi-node 
> >> Erlang system, maintaining n2 TCP connections seems just a tad 
> >> unwieldy.  Personally, I'd be more likely to use a connectionless 
> >> protocol, maybe even broadcast.
> >> 
> >> 
> >>> 
> >>> The TCP implementors trust that the OS is reliable.
> >>> 
> >>> The OS implementors trust that the processor is reliable.
> >>> 
> >>> The processor implementors trust that the VLSI compilers are correct.
> >>> 
> >>> Software runs on physical machines - so really the laws of physics 
> >>> apply not
> >>> maths. Physics takes into account space and time, and the concept of
> >>> simultaneity does not exist, no so in maths.
> >>> 
> >>> It seems to me that software is built upon chains of trust, not upon
> >>> mathematical chains of proof.
> >>> 
> >>> I've just been saying "what we want to achieve" and not "how we can 
> >>> achieve
> >>> it".
> >> 
> >> Which brings us back to:
> >> 
> >> stated goals:  unreliable delivery, ordered delivery
> >> 
> >> The BEAM Book details how this works within a node, but is silent on how 
> >> distributed Erlang is implemented.  I'm really interested in some details.
> >> 
> >>> 
> >>> The statements that people make about the system should be in terms
> >>> of belief rather than proof.
> >>> 
> >>> I'd say "I believe we have reliable message passing"
> >>> It would be plain daft to say "we have reliable message passing" or
> >>> "we can prove it be correct" since there is no way of validating this.
> >> 
> >> Sure there is.  The state machine model of TCP is very clearly defined, 
> >> including its various error conditions.  And one can test an 
> >> implementation for adherence to the state machine model.  (In some 
> >> cases, one can also demonstrate that software is provably correct - but 
> >> let's not go there).
> >> 
> >> 
> >>> 
> >>> Call me old fashioned but I think that claims that, for example,
> >>> "we have unlimited storage" and so on are just nuts ...
> >> 
> >> Agreed.  But claims like "when allocated storage reaches 80% use, 
> >> additional storage is allocated by <mechanism>" are not just reasonable, 
> >> but mandatory when designing systems that have to scale under uncertain 
> >> load.
> >> 
> >> Which brings us back to - how is message passing implemented between 
> >> Erlang nodes?
> >> 
> >> Cheers,
> >> 
> >> Miles
> >> 
> >> -- 
> >> In theory, there is no difference between theory and practice.
> >> In practice, there is.  .... Yogi Berra
> >> 
> > 
> >> _______________________________________________
> >> erlang-questions mailing list
> >> 
> >> http://erlang.org/mailman/listinfo/erlang-questions
> > 
> > 
> > -- 
> > 
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > _______________________________________________
> > erlang-questions mailing list
> > 
> > http://erlang.org/mailman/listinfo/erlang-questions
> 

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


More information about the erlang-questions mailing list