[erlang-questions] User-level heartbeat (was mnesia bug)

Ulf Wiger ulf@REDACTED
Thu Mar 13 08:34:13 CET 2008


If it's following my advice, it's basically an administrative channel
separate from Distributed Erlang. We use a UDP port, and our
cluster controllers periodically send a status message to each
of the others. If you receive such a message from a node that's
not in your nodes() list (and you run with {dist_auto_connect, once}
semantics), then you have a problem: the network path to that node
works, but the distribution link has not been re-established, so
the cluster is partitioned.

With these semantics, the channel doesn't really need to be fully
reliable; if a message is dropped, the indication is merely delayed
until the next interval.
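
To make this concrete, here is a minimal sketch of such a channel. It
is an illustration under stated assumptions, not the code we actually
run: the module name user_heartbeat, the peer list format, the message
format (just the sender's node name as a binary) and the logging
reaction are all hypothetical.

```erlang
-module(user_heartbeat).
-export([start/3, classify/2]).

%% Sketch only. Peers is assumed to be a list of {Host, UdpPort}
%% tuples for the other cluster controllers; IntervalMs is the
%% period between status messages.
start(Port, Peers, IntervalMs) ->
    spawn(fun() ->
        %% Open the socket inside the new process so that it
        %% receives the {udp, ...} messages.
        {ok, Sock} = gen_udp:open(Port, [binary, {active, true}]),
        self() ! tick,
        loop(Sock, Peers, IntervalMs)
    end).

loop(Sock, Peers, IntervalMs) ->
    receive
        tick ->
            %% Status message: our node name, sent as a binary so
            %% the receiver never has to create atoms from the wire.
            Msg = atom_to_binary(node(), utf8),
            lists:foreach(
              fun({Host, PeerPort}) ->
                      gen_udp:send(Sock, Host, PeerPort, Msg)
              end, Peers),
            erlang:send_after(IntervalMs, self(), tick),
            loop(Sock, Peers, IntervalMs);
        {udp, Sock, _Ip, _FromPort, Name} ->
            Known = [atom_to_binary(N, utf8) || N <- [node() | nodes()]],
            case classify(Name, Known) of
                connected ->
                    ok;
                partitioned ->
                    %% Reachable over UDP but absent from nodes():
                    %% what to do about it is left to the application.
                    error_logger:warning_msg(
                      "status message from unconnected node ~s~n", [Name])
            end,
            loop(Sock, Peers, IntervalMs);
        _Other ->
            loop(Sock, Peers, IntervalMs)
    end.

%% Pure check: is the sender among the nodes we are connected to?
classify(Name, Known) ->
    case lists:member(Name, Known) of
        true  -> connected;
        false -> partitioned
    end.
```

Each controller would start it with something like
user_heartbeat:start(4370, [{"hostb", 4370}], 5000), where the port
and host are made-up values. A lost datagram needs no retry logic;
the next tick simply sends the status again.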

BR,
Ulf W

2008/3/13, Jay Nelson <jay@REDACTED>:
> Serge wrote:
>
>   > The only solution is to follow a variation of Uffe's advice (*) on
>   > recovering mnesia from a partitioned network - run the NodeC node
>   > with {dist_auto_connect, once} and, upon detecting a nodedown event
>   > on NodeC, enable user-level heartbeat, and upon receiving a response
>   > from NodeA or NodeB restart NodeC.
>
>  What exactly are you referring to when you say "user-level
>  heartbeat"?  Is this an application you rolled yourself for pinging
>  with UDP packets, or is there some option for turning on and off the
>  heartbeat process that comes with OTP?
>
>  jay


