[erlang-questions] Leader elections and quorum stuff

Mon Aug 10 19:29:21 CEST 2015

Where is the persistent state? If a node goes down, where do new nodes get their data from so they can continue the work? Is there an outside DB or is data persistence a part of this system?

Sergej

On 10/08/15 18:36, "Roger Lipscombe" <erlang-questions-bounces@REDACTED on behalf of roger@REDACTED> wrote:

>I've got a situation where I have a cluster of nodes.
>
>What's the current state of the art for deciding who decides whether a
>node is down? To rephrase: are there any good algorithms (or Erlang
>libraries) that decide which subset of nodes should monitor any given
>node (or all other nodes)? I don't want every node monitoring every
>node (or do I?)
>
>Also, once they've detected a failure, how to distribute the dead node's work?
>
>By "work", I mean that each node is running a *large* number of
>different long-lived jobs. If one of the nodes dies, I need to
>distribute those jobs fairly across the other nodes in the cluster. A
>single job should not run in more than one place.
>
>Assume that every node knows about every other node's assigned work,
>either through some kind of gossip protocol, or through a shared
>store.
>
>I'm kinda assuming that the monitoring nodes will hold a quick
>election, so that there's only a single arbiter, but anything that
>shows how to do that without a single leader would be good too.
>
>Thanks,
>Roger.
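
One leaderless way to meet the "distribute fairly, never in more than one place" requirement from the quoted message is rendezvous (highest-random-weight) hashing. This is not something proposed in the thread, only a minimal sketch under the stated assumption that every node already agrees on the set of live nodes and on the full list of job IDs. The names `score`, `owner` and `assign` are hypothetical, and Python is used purely for illustration; in Erlang, `erlang:phash2/2` could play the role of the hash.

```python
import hashlib


def score(job_id: str, node: str) -> int:
    """Deterministic pseudo-random weight for a (job, node) pair."""
    digest = hashlib.sha256(f"{node}|{job_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(job_id: str, live_nodes) -> str:
    """Pick the live node with the highest weight for this job.

    Every node that sees the same live set computes the same owner,
    so no arbiter is needed for the assignment itself.
    """
    nodes = sorted(live_nodes)  # sorting makes tie-breaking order-independent
    if not nodes:
        raise ValueError("no live nodes to assign work to")
    return max(nodes, key=lambda node: score(job_id, node))


def assign(job_ids, live_nodes) -> dict:
    """Map every job to exactly one owner among the live nodes."""
    return {job_id: owner(job_id, live_nodes) for job_id in job_ids}


if __name__ == "__main__":
    # Hypothetical node names and job IDs.
    nodes = ["a@host1", "b@host2", "c@host3"]
    jobs = [f"job-{i}" for i in range(12)]
    before = assign(jobs, nodes)
    after = assign(jobs, [n for n in nodes if n != "b@host2"])
    for job in jobs:
        print(job, before[job], "->", after[job])
```

When a node is dropped from the live set, only the jobs it owned change owner; every other job stays where it was. The sketch does not solve failure detection or membership agreement: if two nodes disagree about who is alive, they can disagree about ownership, so the "single place" guarantee is only as strong as the agreement on the live set.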