[erlang-questions] question re. message delivery

Wed Sep 27 17:45:48 CEST 2017

On 9/26/17 10:58 PM, Richard A. O'Keefe wrote:
> (1) I told my concurrent programming class that
>     Erlang message delivery should be taken as
>     reliable up to the point where communication
>     is lost with the receiver, so that *IF* a
>     message is received, all previous messages
>     from that sender have been received in order.

That seems like an awfully big assumption, particularly when working 
with a distributed Erlang system.  Absent sequence numbers, there are 
lots of ways that messages can be lost invisibly to the sender and 
receiver.  (And what does it mean to "lose communication" when dealing 
with connectionless message delivery?)
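
To make the sequence-number point concrete, here is a minimal sketch in Python (not Erlang, and not how the Erlang runtime works). The class and method names are hypothetical. It assumes each sender numbers its messages 0, 1, 2, ... and shows how a receiver could spot a gap:

```python
class GapDetector:
    """Tracks the next expected sequence number for each sender."""

    def __init__(self):
        self.expected = {}  # sender -> next sequence number we expect

    def observe(self, sender, seq):
        """Return the sequence numbers this message reveals as missing."""
        nxt = self.expected.get(sender, 0)
        if seq < nxt:
            # Duplicate or late arrival: it reveals no new loss.
            return []
        missing = list(range(nxt, seq))
        self.expected[sender] = seq + 1
        return missing
```

Note that a gap only shows up when a later message arrives. If the last message from a sender is lost, the receiver sees nothing unusual, which is why acks or timeouts are needed as well.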

>
> (2) I also told them that the big problem is
>     losing communication for a while and then
>     it comes back (e.g., someone accidentally
>     pulled a plug and then pushed it back in)
>     but that this is why TCP has sequence numbers
>     and acks.

Yes, but a TCP connection can also fail and be replaced by a new one, 
and anything in flight when the old connection broke can get lost.  
Ordering within each connection, however, is maintained.  (It gets more 
interesting if you're trying to maintain something like an SSH session 
from a moving cell phone - you need something more than TCP to maintain 
continuity across changes in IP address that cause TCP connections to 
drop.)
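
One common way to cover the reconnect case is for the sender to keep its own copy of everything the peer has not yet acknowledged and replay it on the new connection. A rough sketch in Python, with hypothetical names, assuming application-level sequence numbers and cumulative acks (neither of which TCP provides across connections):

```python
from collections import OrderedDict


class ResendBuffer:
    """Sender-side buffer of messages not yet acknowledged by the peer."""

    def __init__(self):
        self.next_seq = 0
        self.unacked = OrderedDict()  # seq -> payload, in send order

    def send(self, payload):
        """Assign a sequence number and remember the message until acked."""
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return (seq, payload)

    def ack(self, seq):
        """Cumulative ack: the peer has everything up to and including seq."""
        for s in list(self.unacked):
            if s <= seq:
                del self.unacked[s]

    def replay(self):
        """After a reconnect, return the messages to resend, in order."""
        return list(self.unacked.items())
```

The receiver then has to tolerate duplicates, since a message may have arrived just before the old connection broke while its ack did not.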

>
> (3) I also told them that it is the nature of
>     the physical world that when you send someone
>     a message (texting on a mobile phone is a
>     great example) you can know that you SENT it
>     but you can never know they RECEIVED it
>     unless they tell you and gave the example of
>     my daughter wanting a ride home but my phone's
>     extremely limited mailbox filling up so I did
>     not get her message until hours later.
>
> (4) As for Joe's general philosophy of belief about
>     systems, I'm reminded of Dijkstra's distinction
>     between a Sufficiently Large Machine (one which
>     is able to run your program without exhausting
>     its resources) and a Hopefully Sufficiently
>     Large Machine (one which either does the job
>     properly or TELLS you it ran into trouble).
>     Having learned on a B6700 where the hardware
>     checked array subscripts and integer overflow
>     -- so that this was not something you could or
>     would consider turning off, there being no
>     cheaper way to do this -- and then meeting
>     the world of PDP-11s and DEC-10s, I quickly
>     learned the painful distinction between a
>     Hopefully Sufficiently Large Machine (B6700)
>     and an Insufficiently Large Machine (the others,
>     which just quietly went insane).
>
>     There are all sorts of properties we'd like
>     our systems to have, and they sort of
>     approximately do, most of the time, but we
>     really want to be TOLD when they're unable
>     to do their job properly.

Yes, indeed.
>
>     The Armstrong approach, after all, is not
>     "ignore errors", but "let it crash".

Yes, indeed.  But then one has to do something after the crash. :-)

>
> (5) I've just started looking at the MQTT protocol,
>     and noticed that you can ask for
>     "at least once", "at most once", or "exactly
>     once" delivery.  I suspect that this is another
>     area where it's "belief" not proof, and that
>     the end-to-end principle applies.

Haven't studied MQTT specifically, but where protocols are concerned, 
there are certainly ways to "prove" things about the specifications of a 
protocol.  Belief enters into implementation correctness and failure 
modes.  One can also support belief through testing & validation of 
implementations.  We do that all the time.
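
As a toy illustration of the kind of test one might write (this is not MQTT, and every name here is made up for the example): a sender that retries until it sees an ack gives "at least once" delivery, and a receiver that remembers message ids turns that into processing each message once. The sketch assumes the loss of acks is scripted by the test, not random:

```python
class DedupReceiver:
    """Processes each message id once, however many times it is delivered."""

    def __init__(self):
        self.seen = set()
        self.processed = []

    def deliver(self, msg_id, payload):
        """Process the payload if msg_id is new; always answer with an ack."""
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.processed.append(payload)
        return True


def send_at_least_once(receiver, msg_id, payload, ack_lost_on, max_attempts=5):
    """Resend until an ack gets through; return the number of attempts.

    ack_lost_on is a set of 1-based attempt numbers whose ack is
    pretended lost, so the test controls the failures.
    """
    for attempt in range(1, max_attempts + 1):
        acked = receiver.deliver(msg_id, payload)
        if acked and attempt not in ack_lost_on:
            return attempt
    raise TimeoutError("no ack after %d attempts" % max_attempts)
```

The guarantee holds only as long as the receiver's set of seen ids survives. If the receiver crashes and forgets it, duplicates get processed again, which is where the "belief" about the implementation comes back in.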

Cheers,

Miles

-- 
In theory, there is no difference between theory and practice.
In practice, there is.  .... Yogi Berra