Redundancy, gen_leader and network splits
Ulf Wiger (AL/EAB)
ulf.wiger@REDACTED
Fri Jun 3 16:22:37 CEST 2005
Tim Bates wrote:
>
> Does the version of gen_leader in jungerl fix the
> bug that was discussed on this mailing list a while
> ago in the election process?
I'm not sure. The guys in Göteborg are working on it,
but I think that a new leader-election algorithm
basically has to be invented. ;-)
> Finally, I want to be absolutely sure that there
> won't be two nodes both thinking they're the leader
> at the same time, which I don't think gen_leader ensures.
It doesn't. This is of course a pathological case:
there is no generic solution to the problem.
What one would like to have is a method to detect
it.
One thing that should work is to turn off auto-connect
on your nodes (this is done with the configuration
parameter -kernel dist_auto_connect always | never | once).
If you set it to e.g. 'once', two nodes that have lost
contact will not reconnect unless at least one of them
restarts (in which case it is able to connect automatically
to the other). This way, if you suffer intermittent
loss of Erlang communication, the network will stay
separated until you've figured out what to do.
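
As a sketch of the setting described above: the parameter can be given
on the command line (erl -kernel dist_auto_connect once) or in a
config file. The file name sys.config and the node name below are
only assumptions for illustration.

```erlang
%% sys.config (hypothetical file name)
%% Start the node with:  erl -sname a -config sys
%% With 'once', a connection that has gone down is not re-established
%% automatically until one of the two nodes has been restarted.
[{kernel, [{dist_auto_connect, once}]}].
```

A node can check that the value was picked up with
application:get_env(kernel, dist_auto_connect).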
You can detect the situation e.g. by letting the
gen_leader processes ping each other through a
"back door" (say, using UDP). If one leader gets a
backdoor ping from a node that's not in the nodes()
list, you have a problem, and need to decide who gets
to yield.
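
The back-door check could be sketched roughly as follows. This is a
minimal sketch, not part of gen_leader: the module name backdoor, the
port number, and the function names are all made up for illustration,
and it only logs the suspected split rather than deciding who yields.
Node names travel as binaries so that a ping from an unknown node
never creates new atoms on the receiver.

```erlang
-module(backdoor).
-export([start/1, ping/2, classify/3]).

%% Open the UDP back door on Port and handle incoming pings.
%% Meant to run in its own process, e.g. spawn(backdoor, start, [4711]).
start(Port) ->
    {ok, Socket} = gen_udp:open(Port, [binary, {active, true}]),
    loop(Socket).

%% Send this node's name to the back door of a peer at Host:Port,
%% using a short-lived socket on an ephemeral local port.
ping(Host, Port) ->
    {ok, Socket} = gen_udp:open(0, [binary]),
    Result = gen_udp:send(Socket, Host, Port,
                          atom_to_binary(node(), utf8)),
    ok = gen_udp:close(Socket),
    Result.

loop(Socket) ->
    receive
        {udp, Socket, _Ip, _InPort, Sender} ->
            Self = atom_to_binary(node(), utf8),
            Known = [atom_to_binary(N, utf8) || N <- nodes()],
            case classify(Sender, Self, Known) of
                ok ->
                    ok;
                split ->
                    %% The sender is alive but not connected to us.
                    error_logger:error_msg(
                      "possible network split: ping from ~s~n",
                      [Sender])
            end,
            loop(Socket)
    end.

%% Pure decision function: a ping from ourselves or from a node in
%% the connected list is fine; anything else suggests a split.
classify(Sender, Self, _Known) when Sender =:= Self ->
    ok;
classify(Sender, _Self, Known) ->
    case lists:member(Sender, Known) of
        true  -> ok;
        false -> split
    end.
```

Each node would run start/1 once and call ping/2 periodically towards
the other nodes' back doors. What to do on 'split' (for example, which
leader should yield) is left open here, as in the text above.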
I don't think gen_leader has to be modified in order
to do this. You can do it on the side and reboot
the "minority leader", once identified.
/Uffe