[erlang-questions] Split brain in distributed Erlang?
Sat Apr 28 13:20:41 CEST 2007
A few tricks that might help are:
- The kernel configuration parameter dist_auto_connect, set on the
command line with -kernel dist_auto_connect once.
It makes sure that nodes can only connect automatically once;
if the connection is lost, one of the nodes must restart, or the
connection must be established manually. This way one has time
to diagnose the situation and take appropriate action.
- The global name server has a 'deconflict method'. The normal
action, if it finds conflicting registrations when reconnecting with
another node, is to pick one of the two registered processes at random
and kill it, leaving the other one registered. Another option is to
unregister all instances of the given name. A third option is to
provide your own function to resolve the conflict.
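
To make the two tricks above concrete, here is a minimal sketch, not a
recommended implementation. The module name split_brain_sketch, the
function names, and the "node name that sorts first wins" rule are all
made up for illustration; only global:register_name/3 and
net_kernel:connect_node/1 are actual OTP calls.

```erlang
%% Start each node with automatic reconnection disabled after the first
%% connection, e.g.:
%%   erl -sname a -kernel dist_auto_connect once
-module(split_brain_sketch).
-export([register_unique/2, resolve/3, reconnect/1]).

%% Register Pid globally under Name, supplying our own conflict resolver
%% instead of relying on global's default one.
register_unique(Name, Pid) ->
    global:register_name(Name, Pid, fun ?MODULE:resolve/3).

%% Called by global when two partitions reconnect and both have Name
%% registered. Assumed policy: keep the process on the node whose name
%% sorts first, and send the other one an exit signal.
resolve(Name, Pid1, Pid2) ->
    {Winner, Loser} =
        case node(Pid1) =< node(Pid2) of
            true  -> {Pid1, Pid2};
            false -> {Pid2, Pid1}
        end,
    exit(Loser, {name_conflict, Name, Winner}),
    Winner.

%% With dist_auto_connect set to once, a lost connection has to be
%% re-established explicitly, once the situation has been diagnosed.
reconnect(Node) ->
    net_kernel:connect_node(Node).
```

A deterministic rule like this is only one possible choice; the point is
that the resolver returns the pid that keeps the name (or the atom none
to unregister it entirely).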
Gen_leader is also designed to handle split brain. It has been
verified using model checking, QuickCheck, etc.
2007/4/27, Tom Samplonius <tom@REDACTED>:
> How do you deal with split brain issues in distributed Erlang? In my
> case, I would like a single process that is running on a node, processing
> messages. If the node fails, start the process elsewhere. But if the
> "node" fails, is it down, or just unreachable? I don't want it to be possible
> for two nodes to be working on the same request. I assume that using 3+
> nodes and a quorum type system is the standard solution? Is there a library
> for managing this?
> Basically, if a node detects it is not part of the quorum (can't see a
> majority of the nodes), it should stop doing anything, until it can
> rejoin. And if the quorum master notices that a node has disappeared that
> was doing some sort of monitored process, it should restart that process on
> another node.
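
The majority test described in the question above could be sketched
roughly as follows. This is only an illustration: the module name
quorum_sketch and the idea of passing in a fixed list of cluster nodes
are assumptions, not part of any existing library.

```erlang
-module(quorum_sketch).
-export([has_quorum/1, has_quorum/2]).

%% Cluster is the full, statically known list of node names.
%% The local node counts as visible to itself.
has_quorum(Cluster) ->
    has_quorum(Cluster, [node() | nodes()]).

%% True if strictly more than half of Cluster appears in Visible.
has_quorum(Cluster, Visible) ->
    Up = [N || N <- Cluster, lists:member(N, Visible)],
    2 * length(Up) > length(Cluster).
```

A node for which has_quorum/1 returns false would stop processing
requests until it can see a majority again.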