[erlang-questions] User-level heartbeat (was mnesia bug)
Ulf Wiger
ulf@REDACTED
Thu Mar 13 08:34:13 CET 2008
If it's following my advice, it's basically an administrative channel
separate from Distributed Erlang. We use a UDP port, and our
cluster controllers periodically send a status message to each
of the others. If you receive such a message from a node that's
not in your nodes() list (and you have connect_once semantics),
then you have a problem: the sender is alive but no longer
connected to you, which points to a partitioned network.
With these semantics, the channel doesn't really need to be fully
reliable; if a message is dropped, the indication will be delayed
until the next interval.
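
A minimal sketch of that check follows, written in Python rather than
Erlang purely for illustration. It is not the implementation described
above: the JSON message format and the names encode_status,
decode_status, check_status and send_status are all assumptions, and the
`connected` set stands in for the result of nodes().

```python
import json
import socket
from typing import Iterable, Optional, Set, Tuple


def encode_status(node_name: str) -> bytes:
    """Build the periodic status datagram (hypothetical JSON format)."""
    return json.dumps({"node": node_name}).encode("utf-8")


def decode_status(payload: bytes) -> str:
    """Extract the sender's node name from a status datagram."""
    return json.loads(payload.decode("utf-8"))["node"]


def check_status(payload: bytes, self_name: str,
                 connected: Set[str]) -> Optional[str]:
    """Return the sender's name if the message indicates a problem.

    `connected` plays the role of Erlang's nodes() list. A status
    message from a node that is not in it means the sender is alive
    but not connected to us; otherwise None is returned.
    """
    sender = decode_status(payload)
    if sender == self_name or sender in connected:
        return None
    return sender


def send_status(sock: socket.socket, node_name: str,
                peers: Iterable[Tuple[str, int]]) -> None:
    """Send one status datagram to each peer (host, port) address.

    No acknowledgement or retry: a lost datagram only delays the
    indication until the next periodic send.
    """
    payload = encode_status(node_name)
    for addr in peers:
        sock.sendto(payload, addr)
```

Calling send_status on a timer and passing every received datagram to
check_status is enough for this scheme; what to do when check_status
returns a name (for example, restarting the node) is left to the caller.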
BR,
Ulf W
2008/3/13, Jay Nelson <jay@REDACTED>:
> Serge wrote:
>
> > The only solution is to follow a variation of Uffe's advice (*) on
> > recovering mnesia from a partitioned network: run the NodeC node
> > with {dist_auto_connect, once}, enable a user-level heartbeat upon
> > detecting a nodedown event on NodeC, and restart NodeC upon
> > receiving a response from NodeA or NodeB.
>
> What exactly are you referring to when you say "user-level
> heartbeat"? Is this an application you rolled yourself for pinging
> with UDP packets, or is there some option for turning on and off the
> heartbeat process that comes with OTP?
>
> jay