[erlang-questions] can nodes fail/recover too fast to be seen?

Fri Jul 5 22:07:05 CEST 2013

Tim Watson <watson.timothy@REDACTED> wrote:
>
>On 5 Jul 2013, at 20:23, Per Hedeland wrote:
>>>> On Jul 5, 2013, at 5:22 PM, Tim Watson wrote:
>>>> 
>>>>> As I understand it, this can and does happen, because Erlang does automatic reconnect in order to provide reliable communications.
>> 
>> No.
>
>So is the Svensson and Fredlund paper (viz [2] from my earlier post) incorrect in its assertion that messages between nodes can be dropped in the face of rapid node reconnects?

No, of course they can be dropped - if the destination node goes down,
it's impossible to know whether a given, sent message was a) received at
the remote *host*, b) received by the remote Erlang node, c) received by
the remote Erlang process, d) processed by the remote Erlang process, or
e) none of the above. But if you monitor/link (e.g. use
gen_server:call()), you will know that "badness happened", and can take
corrective action. "Re-sending only messages that need to be re-sent" is
not possible in general, and this is not specific to Erlang distribution.
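
To make the monitor approach concrete, here is a minimal sketch of a
call that either gets a reply or learns that "badness happened". The
module name and the {request, ...} / {reply, ...} message shapes are
made up for illustration; they are not part of OTP, and
gen_server:call() does something similar for you already.

```erlang
-module(monitored_call).
-export([call/3]).

%% Hypothetical request/reply protocol, for illustration only:
%%   client -> server: {request, From, Ref, Request}
%%   server -> client: {reply, Ref, Reply}
call(Pid, Request, Timeout) ->
    %% Monitor first, so that a failure after this point is reported
    %% to us as a 'DOWN' message.
    MRef = erlang:monitor(process, Pid),
    Pid ! {request, self(), MRef, Request},
    receive
        {reply, MRef, Reply} ->
            %% Drop the monitor and any 'DOWN' already in the mailbox.
            erlang:demonitor(MRef, [flush]),
            {ok, Reply};
        {'DOWN', MRef, process, _Pid, Reason} ->
            %% Reason is noconnection if the connection to the remote
            %% node was lost, noproc if the process was already gone.
            {error, {down, Reason}}
    after Timeout ->
        erlang:demonitor(MRef, [flush]),
        {error, timeout}
    end.
```

Note that {error, {down, _}} only says that something went wrong. It
does not say whether the request was received or processed before the
failure, so the caller still has to decide whether re-sending is safe.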

See also http://www.erlang.org/faq/academic.html#id58000, which Matthias
Lang was kind enough to write up in a nice form based on some ramblings
of mine in the distant past. It could probably use s/link/monitor/, but
the general principle holds.

>>>>>> In Erlang, is it possible for a monitored node to fail and recover so quickly that nodes monitoring it won't detect the failure?
>> 
>> No. The TCP connection to the old node instance cannot be used for
>> communication with the new node instance, i.e. there is no way that
>> communication with the new node instance can be established without the
>> local VM generating nodedown/'DOWN'/exit messages for the old instance.
>> 
>
>Just out of interest, is this enforced by epmd or internally?

epmd has no role in inter-node communication once the connection has
been established. TCP enforces "cannot be used for ...". The
VM/net_kernel will not make a new connection until it has decided that
the old one isn't working any more, and at that point it will generate
the nodedown/'DOWN'/exit messages.
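
If you want to see those node-level notifications yourself, a process
can subscribe with net_kernel:monitor_nodes/1 and will then receive
{nodeup, Node} and {nodedown, Node} messages. The sketch below just
forwards them to an owner process; the module name and the
{node_event, ...} tuple are made up for illustration.

```erlang
-module(node_watch).
-export([start/1]).

%% Start a watcher that forwards node status changes to Owner.
start(Owner) ->
    spawn_link(fun() ->
        %% Subscribe the calling process to nodeup/nodedown messages.
        ok = net_kernel:monitor_nodes(true),
        loop(Owner)
    end).

loop(Owner) ->
    receive
        {nodeup, Node} ->
            Owner ! {node_event, up, Node},
            loop(Owner);
        {nodedown, Node} ->
            %% Sent when the local VM decides the connection to Node
            %% is gone, before any new connection to it is made.
            Owner ! {node_event, down, Node},
            loop(Owner)
    end.
```

A node that restarts quickly should therefore show up as a
{node_event, down, Node} followed by a {node_event, up, Node}, not as
nothing at all.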

>>>>>> Or, is there some kind of internal persistent state that prevents this?
>> 
>> This is where it potentially gets interesting - i.e. assuming *no*
>> monitoring or linking - and that's where the "creation" part of a node
>> identifier comes into play. If a distributed node restarts, it will get
>> a new "creation" value courtesy of epmd, and any pid() values
>> referring to the old node instance will be invalid.
>> 
>
>Does this depend on epmd having stayed up and running the whole time, or does epmd now have some local persistent state?

Good point - it depends on epmd having stayed up and running, i.e. if
the *host* reboots, there is a 25% possibility of the new node instance
getting the same "creation" value. However, see the FAQ again - if your
communication is critical, you can't depend on "creation" - it won't
tell you about failures anyway. It's just a way to try to prevent
messages from being delivered to the wrong process.

--Per