[erlang-questions] User-level heartbeat (was mnesia bug)
Serge Aleynikov
saleyn@REDACTED
Thu Mar 13 12:42:02 CET 2008
Jay Nelson wrote:
> Serge wrote:
>
> > The only solution is to follow a variation of Uffe's advice (*) on
> > recovering mnesia from a partitioned network - run the NodeC node with
> > {dist_auto_connect, once}, enable a user-level heartbeat upon detecting
> > a nodedown event on NodeC, and restart NodeC upon receiving a response
> > from NodeA or NodeB.
>
> What exactly are you referring to when you say "user-level
> heartbeat"? Is this an application you rolled yourself for pinging
> with UDP packets, or is there some option for turning on and off the
> heartbeat process that comes with OTP?
Unfortunately OTP doesn't have this feature; it looks like it was
designed with reliable networks in mind. So some time ago I put
together a link-monitoring app. During a loss of connectivity it pings
the nodes it's supposed to be connected to over UDP/TCP, and when it
sees echoes from the peer nodes it calls user-defined callbacks to
determine the recovery action (e.g. what to do with the mnesia
instance, restart the node, reconnect, etc.).
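
To make the idea concrete, here is a minimal sketch of the probing side.
It is not the actual app. It is written in Python only so that it runs
standalone, and the names start_echo_responder, probe_peers and
on_peer_alive are hypothetical. It assumes each peer runs a plain UDP
echo responder and that peers are given as (numeric IP, port) tuples.

```python
import socket
import threading


def start_echo_responder(host="127.0.0.1", port=0):
    """Start a UDP echo responder in a daemon thread.

    Returns (socket, bound_port). Hypothetical helper: every peer is
    assumed to run something like this so probes can be answered.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))

    def serve():
        while True:
            try:
                data, addr = sock.recvfrom(1024)
            except OSError:
                return  # socket was closed, stop serving
            sock.sendto(data, addr)  # echo the payload back to the sender

    threading.Thread(target=serve, daemon=True).start()
    return sock, sock.getsockname()[1]


def probe_peers(peers, on_peer_alive, attempts=3, timeout=0.5):
    """Ping each (ip, port) peer over UDP after connectivity was lost.

    Calls the user-defined on_peer_alive(peer) callback for every peer
    that echoes the probe, and returns the list of peers that answered.
    The callback is where a recovery action would be chosen (restart,
    reconnect, and so on).
    """
    alive = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for peer in peers:
            for _ in range(attempts):
                try:
                    sock.sendto(b"ping", peer)
                    data, addr = sock.recvfrom(1024)
                except OSError:
                    continue  # timeout or ICMP error, try again
                # Accept only an echo that came from the probed peer.
                if data == b"ping" and addr == peer:
                    alive.append(peer)
                    on_peer_alive(peer)
                    break
    return alive
```

In the scenario above, NodeC would start calling something like
probe_peers only after it sees nodedown, and the callback would restart
the node once NodeA or NodeB answers.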
Serge