[erlang-questions] question re. message delivery

Thu Sep 28 12:52:25 CEST 2017

IIRC there is one guarantee in newer versions of Erlang (for a pretty conservative definition of new ;-)

All messages from another node arrive between its nodeup and nodedown message.

This means that you can always detect if there is a possible message loss between two nodes
by monitoring the node.  

Or as I understand monitoring a remote process also does this implicitly since a link-down triggers 
DOWN messages on all monitored processes across this link and a nodedown

Please correct me if this is not true (modulo the Hans Svensson/Lars-Åke Fredlund paper).

Cheers,
-- Peer 

> On 27.09.2017, at 16:42, Raimo Niskanen <raimo+erlang-questions@REDACTED> wrote:
> 
> I'll try to summarize what I know on the topic, acting a bit as a ghost
> writer for the VM Team.
> 
> See also the old, outdated language specification, which is the best we have.
> It is still the soother that the VM Team sucks on when they do not know
> what else to do, and an updated version is in the pipeline.
> See especially 10.6.2 Order of signals:
> 
>    http://erlang.org/download/erl_spec47.ps.gz
> 
> 
> Also see, especially 2.1 Passing of Signals "signals can be lost":
> 
>    http://erlang.org/doc/apps/erts/communication.html
> 
> 
> Message order between a pair of processes is guaranteed.  I.e. a message
> sent after another message will not be received before that other message.
> 
> Messages may be dropped.  In particular due to a communication link
> between nodes going down and up.
> 
> If you set a monitor (or a link) on a process you will get a 'DOWN'
> message if the other process vanishes i.e. dies or communication link lost.
> That 'DOWN' message is guaranteed to be delivered (the same applies to
> links and 'EXIT' messages).
> 
> An example: if process P1 first sets a monitor on process P2 and then
> sends messages M1, M2 and M3 to P2.  P2 acknowledges M1 with M1a and M3
> with M3a.  Then if P1 gets M1a it knows that P2 has seen M1 and P1 is
> guaranteed to eventually get either M3a or 'DOWN'.  If it gets M3a then
> it knows P2 have seen M2 and M3.  If it gets 'DOWN' then M2 may have been
> either dropped or seen by P2, the same applies to M3, and P1 may eventually
> get M3a knowing that P2 has seen M3, but can not know if it has seen M2.
> 
> Another example: gen_server:call first sets a monitor on the server process,
> then sends the query.  By that it knows it will eventually either get
> the reply or 'DOWN'.  If it gets 'DOWN' it actually may get a late reply
> (e.g. network down-up), which is often overlooked.
> 
> The distribution communication is is per default implemented with TCP links
> between nodes.  The VM relies on the distribution transport to deliver
> messages in order, or to die meaning that the link has failed and that any
> number of messages at the end of the sequence may have been dropped.
> 
> Process links and monitors towards processes on remote nodes are registered
> in the local node on the distribution channel entry for that remote node,
> so the VM can trigger 'DOWN' and 'EXIT' messages for all links and monitors
> when a communication link goes down.  These messages are guaranteed to be
> delivered (if their owner exists).
> 
> I hope this clears things up.
> / Raimo Niskanen
> 
> 
> 
> On Tue, Sep 26, 2017 at 05:16:56PM -0700, Miles Fidelman wrote:
>> Hi Joe,
>> 
>> Hmmm....
>> 
>> Joe Armstrong wrote:
>> 
>>> What I said was "message passing is assumed to be reliable"
>> 
>>> 
>>> The key word here is *assumed* my assumption is that if I open a TCP 
>>> socket
>>> and send it five messages numbered 1 to 5 then If I successfully read
>>> message
>>> 5 and have seen no error indicators then I can *assume* that messages 1 to
>>> 4 also arrived in order.
>>> 
>> 
>> Well yes, but with TCP one has sequence numbers, buffering, and 
>> retransmission - and GUARANTEES, by design, that if you (say a socket 
>> connection) receive packet 5, then you've also received packets 1-4, in 
>> order.
>> 
>> My understanding is that Erlang does NOT make that guarantee.  As stated:
>> 
>> - message delivery is assumed to be UNRELIABLE
>> 
>> - ordering is guaranteed to be maintained
>> 
>> The implication being that one might well receive packets 1, 2, 3, 5 - 
>> and not know that 4 is missing.
>> 
>>> Actually I have no idea if this is true - but it does seem to be a
>>> reasonable
>>> assumption.
>>> 
>>> Messages 1 to 4 might have arrived got put in a buffer prior to my reading
>>> them and accidentally reordered due to a software bug. An alpha particle
>>> might have hit the data in message 3 and changed it -- who knows?
>> 
>> 
>> More likely, a TCP connection has dropped, taking a message or two with 
>> it, and once the connection is re-established, stuff starts flowing 
>> after a gap.
>> 
>> With UDP, packets could arrive out of order as well as get dropped.
>> 
>> There are ways to extend TCP, or write a higher level protocol that will 
>> detect dropped connections, and packets, reconnect, request 
>> retransmission - with the result that both the sender & receiver are 
>> guaranteed both delivery & order.
>> 
>> Which brings us back to implementation.
>> 
>>> 
>>> Having assumed that message passing is reliable I build code based on
>>> this assumption.
>> 
>> But, for Erlang, we can't make this assumption - the documentation 
>> specifically says so.
>> 
>>> 
>>> I'm not, of course, saying that the assumption is true, just that I trust
>>> the
>>> implementers of the system have done a good job to try and make it true.
>>> Certainly any repeatable counter examples should have been investigated
>>> to see if there were any errors in the system.
>>> 
>>> All this builds on layers of trust. I trust that erlang message passing is
>>> ordered and reliable in the absence of errors.
>>> 
>>> The Erlang implementers trust that TCP is reliable.
>> 
>> 
>> Well, that is the question, isn't it.  Lots of things cause TCP to drop 
>> connections.  So the question remains - how are dropped connections 
>> handled?  And, if after a connection is dropped and restored, how are 
>> dropped messages and/or messages received out of order handled?
>> 
>> Actually, there's another design question in there - in a multi-node 
>> Erlang system, maintaining n2 TCP connections seems just a tad 
>> unwieldy.  Personally, I'd be more likely to use a connectionless 
>> protocol, maybe even broadcast.
>> 
>> 
>>> 
>>> The TCP implementors trust that the OS is reliable.
>>> 
>>> The OS implementors trust that the processor is reliable.
>>> 
>>> The processor implementors trust that the VLSI compilers are correct.
>>> 
>>> Software runs on physical machines - so really the laws of physics 
>>> apply not
>>> maths. Physics takes into account space and time, and the concept of
>>> simultaneity does not exist, no so in maths.
>>> 
>>> It seems to me that software is built upon chains of trust, not upon
>>> mathematical chains of proof.
>>> 
>>> I've just been saying "what we want to achieve" and not "how we can 
>>> achieve
>>> it".
>> 
>> Which brings us back to:
>> 
>> stated goals:  unreliable delivery, ordered delivery
>> 
>> The BEAM Book details how this works within a node, but is silent on how 
>> distributed Erlang is implemented.  I'm really interested in some details.
>> 
>>> 
>>> The statements that people make about the system should be in terms
>>> of belief rather than proof.
>>> 
>>> I'd say "I believe we have reliable message passing"
>>> It would be plain daft to say "we have reliable message passing" or
>>> "we can prove it be correct" since there is no way of validating this.
>> 
>> Sure there is.  The state machine model of TCP is very clearly defined, 
>> including its various error conditions.  And one can test an 
>> implementation for adherence to the state machine model.  (In some 
>> cases, one can also demonstrate that software is provably correct - but 
>> let's not go there).
>> 
>> 
>>> 
>>> Call me old fashioned but I think that claims that, for example,
>>> "we have unlimited storage" and so on are just nuts ...
>> 
>> Agreed.  But claims like "when allocated storage reaches 80% use, 
>> additional storage is allocated by <mechanism>" are not just reasonable, 
>> but mandatory when designing systems that have to scale under uncertain 
>> load.
>> 
>> Which brings us back to - how is message passing implemented between 
>> Erlang nodes?
>> 
>> Cheers,
>> 
>> Miles
>> 
>> -- 
>> In theory, there is no difference between theory and practice.
>> In practice, there is.  .... Yogi Berra
>> 
> 
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
> 
> 
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions