some basic questions

Sat Oct 9 15:13:15 CEST 1999

Ulf Wiger wrote:
>
>On Fri, 8 Oct 1999, Vance Shipley wrote:
>
>vances>}  there aren't any guarantees for message delivery (no way to know
>vances>}  if a message was consumed or not) 
>vances>}  -- is there any work being done on this?
>vances>
>vances>This has been one of my concerns. I look forward to others' comments.
>
>Well, there is a new monitor() function in OTP R5B, which will also be in
>the next Open Source release (of course). It allows a sending process to
>monitor the receiving process; if the process is alive before you send the
>message, and keeps on kicking afterwards, it got the message.

Hm, unless I'm missing something, monitor/2 is "just" a generalized
(though so far it applies only to pids) and uni-directional version of
link/1 (which is bi-directional), with the further difference that
monitor produces normal messages rather than EXIT signals. And of course
link/1 has been around since forever, and has worked across node
boundaries ever since nodes came into existence as far as I know (it
predates distributed Erlang).

>To clarify: Erlang does guarantee that the message is delivered *if* the
>receiver is alive, and there's a viable path between the sender and
>receiver (don't remember if the spec guarantees this, but the current
>implementation does, for all practical purposes.)

Yes, or to simplify it even further: "Delivery is guaranteed if nothing
breaks" - and if something breaks, you will find out provided you've
used link/1 (or monitor/2, hopefully). I.e. when using link, you will
get an EXIT signal not only if the linked process dies, but also if the
entire remote node crashes, or the network is broken, or if any of these
happen *before* you do the link.
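To illustrate the last point, here is a minimal sketch (not from the
original discussion) of linking to a process that is already dead. The
module and function names are made up for the example, and it assumes a
current Erlang/OTP where spawn_monitor/1 is available; the monitor is
only used to wait until the process is really gone.

```erlang
-module(link_demo).
-export([link_to_dead/0]).

%% Link to a process that has already terminated and return the reason
%% carried by the resulting EXIT signal.
link_to_dead() ->
    %% Trap exits so that EXIT signals arrive as ordinary messages.
    Old = process_flag(trap_exit, true),
    {Pid, MRef} = spawn_monitor(fun() -> ok end),
    %% Wait until the spawned process is known to be gone.
    receive {'DOWN', MRef, process, Pid, _} -> ok end,
    %% Link *after* the death; the link still produces an EXIT signal.
    true = link(Pid),
    Result = receive
                 {'EXIT', Pid, Reason} -> Reason
             after 1000 ->
                 timeout
             end,
    %% Restore the caller's previous trap_exit setting.
    process_flag(trap_exit, Old),
    Result.
```

With exits trapped, link_demo:link_to_dead() should return noproc, i.e.
the sender finds out even though the failure happened before the link.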

It seems this issue of "guaranteed delivery" comes up every now and
then, but I've never managed to find out exactly what those who ask
for it actually want:

- A guarantee that the message is put into the receiver's input queue?
  But if the receiver dies before extracting it from there, that
  guarantee is useless, as the message is lost anyway.

- A guarantee that the receiver extracts the message from its input
  queue? Well, besides the obvious problem that depending on how the
  receiver is written, even if it lives happily ever after it may
  *never* extract that particular message, it suffers from a variant of
  the previous problem: Even if you "know" that the receiver has
  "consumed" the message, it may die before acting on it in any way, and
  then again it may as well never have been sent.

- A guarantee that the receiver actually *processes* the message? Just
  kidding of course, hopefully it's obvious to everyone that the only
  way to obtain such a guarantee, regardless of what programming and
  communication system you use, is that the receiver is programmed to
  send an explicit acknowledgment when the processing is complete (of
  course this may be hidden below an abstraction such as RPC, but the
  fundamental principle holds).

Add to this that any guarantee would *have* to entail some form of ack
from the remote end, at least in a distributed system, even if it wasn't
directly visible to the programmer. E.g. you could have '!' block until
the ack comes back from the remote saying that the message had
progressed however far you required - i.e. synchronous communication of
sorts. But this would penalize those who *don't* require the
"guarantee" and *want* asynchronous communication, and while it's
trivial to implement synchronous communication on top of an asynchronous
mechanism, doing the opposite is a pain (it typically calls for dedicated
"messenger processes" that do the sending for you).
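As a rough sketch of the "trivial" direction, a blocking send can be
layered on top of '!' in a few lines. Everything named here (the
sync_send module, the {request, ...} and {ack, ...} message shapes, the
echo receiver) is an assumption made up for illustration, not an
existing API.

```erlang
-module(sync_send).
-export([start_echo/0, sync_send/3]).

%% Hypothetical receiver: acknowledges every request it extracts from
%% its message queue.
start_echo() ->
    spawn(fun echo_loop/0).

echo_loop() ->
    receive
        {request, From, Ref, _Msg} ->
            From ! {ack, Ref},
            echo_loop()
    end.

%% Send Msg to Pid and block until the matching ack arrives, or give up
%% after Timeout milliseconds.
sync_send(Pid, Msg, Timeout) ->
    Ref = make_ref(),                  % unique tag to match the ack
    Pid ! {request, self(), Ref, Msg},
    receive
        {ack, Ref} -> ok
    after Timeout ->
        {error, timeout}
    end.
```

Note that only callers of sync_send/3 pay for the round trip; everyone
else keeps using plain '!'. A late ack that arrives after the timeout
stays in the caller's queue, which a real implementation would have to
deal with.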

So, depending on your requirements, Erlang offers you *at least* these
levels of "guarantee" (possibly substituting "monitor" for "link"):

Super-safe: Receiver sends ack after processing; sender links, sends,
waits for ack or EXIT => sender knows, for each message, whether it was
fully processed or not. (The link only needs to be done once, of course.)
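A minimal sketch of the super-safe level, using a monitor in place of
the link (the substitution mentioned above). The module name, the worker
and the {work, ...} / {done, ...} message shapes are invented for the
example, and erlang:demonitor/2 with the flush option assumes a current
Erlang/OTP. For simplicity it sets up a monitor per call rather than
once.

```erlang
-module(safe_call).
-export([start_worker/0, call/2]).

%% Hypothetical receiver: sends an ack only after the work is done.
start_worker() ->
    spawn(fun worker_loop/0).

worker_loop() ->
    receive
        {work, From, Tag, {double, N}} ->
            From ! {done, Tag, 2 * N},  % ack *after* processing
            worker_loop();
        {work, _From, _Tag, stop} ->
            exit(stop_requested)        % dies without sending an ack
    end.

%% Monitor, send, then wait for either the ack or the DOWN message.
call(Pid, Request) ->
    MRef = erlang:monitor(process, Pid),
    %% The monitor reference doubles as a unique tag for the reply.
    Pid ! {work, self(), MRef, Request},
    receive
        {done, MRef, Result} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            {error, Reason}
    end.
```

For each request the caller ends up with either {ok, Result} or
{error, Reason}, so it knows whether that particular message was fully
processed; calling a worker that is already dead yields {error, noproc}.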

Medium-safe: Receiver doesn't send acks; sender links, sends message(s)
=> an EXIT signal informs the sender that some messages may never have
been processed.

Pretty-safe: Receiver doesn't send acks; sender sends messages. :-)

Plus of course any number of combinations of these (e.g. receiver sends
ack not after each message but at some critical points in the processing).
And don't forget that the *really* fault-tolerant OTP supervision
mechanisms were implemented with these "simple" "unsafe" primitives. :-)

--Per Hedeland
per@REDACTED

PS You might want to look for parallels in the somewhat successful world
of Internet protocols: IP ought to be a problem, since it doesn't
"guarantee delivery"; nevertheless it was possible to build the
"reliable" TCP on top of it. And if you think TCP "guarantees delivery"
(which most people probably do), then so does Erlang... :-)