[erlang-questions] User-level heartbeat (was mnesia bug)

Serge Aleynikov saleyn@REDACTED
Thu Mar 13 12:42:02 CET 2008


Jay Nelson wrote:
> Serge wrote:
> 
>  > The only solution is to follow a variation of Uffe's advice (*) on
>  > recovering mnesia from a partitioned network: run NodeC with
>  > {dist_auto_connect, once}; upon detecting a nodedown event on NodeC,
>  > enable a user-level heartbeat; and upon receiving a response from
>  > NodeA or NodeB, restart NodeC.
> 
> What exactly are you referring to when you say "user-level heartbeat"?
> Is this an application you rolled yourself for pinging with UDP
> packets, or is there some option for turning on and off the heartbeat
> process that comes with OTP?

Unfortunately, OTP doesn't have this feature; it looks like it was
designed with reliable networks in mind.  So some time ago I put
together a link-monitoring app.  During a loss of connectivity it pings
the nodes it is supposed to be connected to over UDP/TCP, and upon
seeing echoes from the peer nodes it calls user-defined callbacks to
determine the recovery action (e.g. what to do with the mnesia
instance, restart the node, reconnect, etc.).
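
For illustration only, here is a minimal sketch of the idea, not the
actual app.  It assumes the node is started with
"erl -kernel dist_auto_connect once", that every peer runs a plain UDP
echo process, and that the module name user_heartbeat, the peer list
format, and the OnEcho callback are all hypothetical.  It does UDP only
and does not authenticate the echoes.

```erlang
-module(user_heartbeat).
-export([start/3, echo_server/1]).

%% Hypothetical sketch of a user-level heartbeat.
%% Peers  :: [{Node :: node(), Host :: string(), Port :: integer()}]
%% OnEcho :: fun((node()) -> any()), the user-defined recovery callback.
start(Peers, IntervalMs, OnEcho) ->
    spawn(fun() ->
        %% Subscribe to {nodeup, Node} / {nodedown, Node} messages.
        ok = net_kernel:monitor_nodes(true),
        {ok, Sock} = gen_udp:open(0, [binary, {active, true}]),
        loop(Sock, Peers, IntervalMs, OnEcho, [])
    end).

%% Down is the list of peer nodes currently being pinged.
loop(Sock, Peers, Interval, OnEcho, Down) ->
    receive
        {nodedown, Node} ->
            Known = lists:keymember(Node, 1, Peers),
            Pinging = lists:member(Node, Down),
            if
                Known andalso not Pinging ->
                    %% Connectivity lost: start pinging this peer.
                    self() ! {ping, Node},
                    loop(Sock, Peers, Interval, OnEcho, [Node | Down]);
                true ->
                    loop(Sock, Peers, Interval, OnEcho, Down)
            end;
        {ping, Node} ->
            case lists:member(Node, Down) of
                true ->
                    {Node, Host, Port} = lists:keyfind(Node, 1, Peers),
                    %% The datagram carries only the peer's node name.
                    _ = gen_udp:send(Sock, Host, Port,
                                     atom_to_binary(Node, utf8)),
                    erlang:send_after(Interval, self(), {ping, Node});
                false ->
                    ok   % echo already seen, stop pinging
            end,
            loop(Sock, Peers, Interval, OnEcho, Down);
        {udp, Sock, _IP, _Port, Bin} ->
            %% An echo came back: the peer is reachable again.
            case [N || N <- Down, atom_to_binary(N, utf8) =:= Bin] of
                [Node] ->
                    OnEcho(Node),
                    loop(Sock, Peers, Interval, OnEcho,
                         lists:delete(Node, Down));
                _ ->
                    loop(Sock, Peers, Interval, OnEcho, Down)
            end;
        _Other ->
            %% Ignore nodeup and anything else.
            loop(Sock, Peers, Interval, OnEcho, Down)
    end.

%% Run on every peer: echoes each datagram back to its sender.
echo_server(Port) ->
    spawn(fun() ->
        {ok, Sock} = gen_udp:open(Port, [binary, {active, true}]),
        echo_loop(Sock)
    end).

echo_loop(Sock) ->
    receive
        {udp, Sock, IP, Port, Bin} ->
            _ = gen_udp:send(Sock, IP, Port, Bin),
            echo_loop(Sock)
    end.
```

On NodeC this could be started as, say,
user_heartbeat:start([{'a@host1', "host1", 4000}], 1000,
fun(_Node) -> init:restart() end), where the node name, host, and port
are made up.  The callback is where the recovery policy goes: restart
the node, reconnect, or decide what to do with the mnesia instance.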

Serge


