[erlang-questions] gen_leader discrepancies in reporting of downed nodes across a cluster

Jeremy Raymond
Thu Nov 29 17:24:49 CET 2012


I gave that branch a try, but I'm still seeing misreported downed nodes. I
should see correct gen_leader:down/1 and gen_leader:alive/1 lists on all
nodes, correct?

--
Jeremy


On Tue, Nov 27, 2012 at 11:35 PM, Andrew Thompson wrote:

> On Tue, Nov 27, 2012 at 12:47:52PM -0500, Jeremy Raymond wrote:
> > Hi,
> >
> > I'm using the gen_leader behaviour from [1] in a 3-node Erlang cluster.
> > I'm running into a situation where, if I down one of the nodes and bring
> > it back up, the other nodes still see it as down after it rejoins the
> > cluster, as reported by gen_leader:down/1. However, the cycled node itself
> > sees the other two nodes as being up. If I cycle the other two nodes, then
> > all three agree again that all of the nodes are available. This doesn't
> > happen every time I down a node, but quite often. Another (related?) issue
> > is that gen_leader:down/1 sometimes reports the same node as being down
> > multiple times in the returned list.
> >
>
> Would you mind trying the branch at
>
> https://github.com/Vagabond/gen_leader_revival/tree/netsplit-tolerance
>
> This branch contains a bunch of work I did to work around these kinds
> of issues that Basho was seeing with gen_leader.
>
> Andrew