[erlang-questions] leader election ?

Fri Oct 19 18:34:31 CEST 2018

Hi Jachym

Thanks for taking the time to provide me with some hints, and you
are absolutely right. Unfortunately, I do not have control of the feeds
or the downstream nodes, but the status of the downstream nodes should be
taken into consideration when deciding whether my application should take
the active or the passive role.

Right now I am looking into a concept where each application calculates a
"health score" (based on various parameters such as connectivity to the
downstream nodes, IP gateways, etc.).
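
To make the idea a bit more concrete, here is a rough, language-neutral
sketch in Python of how such a score could be computed. The check names
and weights are purely hypothetical; I have not settled on the actual
parameters or how to weigh them.

```python
def health_score(checks, weights):
    """Return the sum of the weights of all checks that currently pass.

    checks  -- dict mapping a check name to True (passing) or False
    weights -- dict mapping the same check names to a numeric weight

    Both the check names and the weights are illustrative assumptions.
    """
    return sum(weights[name] for name, ok in checks.items() if ok)


# Hypothetical example: downstream connectivity counts more than the gateway.
weights = {"downstream": 10, "gateway": 5}
checks = {"downstream": True, "gateway": False}
score = health_score(checks, weights)
```

With these made-up numbers a node that can reach the downstream nodes but
not its IP gateway would score 10 out of a possible 15.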

They should then exchange health scores, and the one with the highest
score takes the active role (if the health scores are identical, the one
configured as preferred active takes it).
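
As a minimal sketch of that decision rule (again in Python just for
illustration; the function and parameter names are my own assumptions),
each instance could run something like this after receiving the score of
its peer. It deliberately does not cover what to do when the peer's score
cannot be obtained at all.

```python
def decide_role(my_score, peer_score, preferred_active):
    """Decide this instance's role from the two exchanged health scores.

    my_score         -- health score calculated locally
    peer_score       -- health score received from the other instance
    preferred_active -- True on exactly one of the two instances (static
                        configuration), used only to break ties
    """
    if my_score > peer_score:
        return "active"
    if my_score < peer_score:
        return "passive"
    # Identical scores: fall back to the statically configured preference.
    return "active" if preferred_active else "passive"
```

As long as both instances see the same pair of scores and exactly one of
them is configured as preferred active, they reach opposite roles.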

A net split can still occur, but if it happens, it won't be a major
issue.

Thomas

On Fri, 19 Oct 2018 at 10:58 Jachym Holecek <freza@REDACTED> wrote:

> Hi Thomas,
>
> # Thomas Elsgaard 2018-10-19:
> > I am running 2 instances of an application (separate datacenters), both
> > receive identical data on TCP sockets, but only one of the applications
> > should process it further downstream.
>
> What does "process it further downstream" entail? Is persistent state being
> updated on the "primary" node? Are further "downstream" services involved
> and if so can they participate in your scheme? Do you control the nodes
> originating the data feeds? What are the worst-case consequences of
> processing the feed at both nodes for some period of time? What are the
> consequences of not processing the feed at all for some period of time?
>
> > Any suggestions to what is the "simplest" way to have a kind of leader
> > election? Both applications must be running, but must have knowledge
> > about their own role (active or passive) in order to decide if the data
> > should be processed downstream.
>
> From memory [*] solves this problem although I forgot the exact details of
> it.
>
> Depending on further details it might be that running periodic healthchecks
> between the two nodes and basing primary/standby roles on current liveness
> plus static priority suffices in practice (despite the obvious fragility).
>
> Or perhaps downstream nodes can advertise their perception of liveness of
> the two nodes to them and these then decide based on that.
>
> It may be that "primary/standby" isn't actually a property of those two
> nodes globally and that the decision is made for disjoint fragments of
> the objects involved depending on network conditions and downstream service
> status.
>
> It may be that the two processing nodes know whether they're currently
> eligible or not (depending on downstream service connectivity and liveness)
> and it may be that the data feed originator is best placed to instruct
> the processing nodes to act as primary / standby at the moment, depending
> on advertised eligibility.
>
> It's really hard to give a general answer. :-)
>
> > It just seems a little "heavy" to use consul or etcd to elect a leader
> > between two application instances.
>
> What if etcd tells you you're primary but whoops you don't have connection
> to etcd because somebody was playing with a firewall and cut you off for
> a few hours? Absent further details it is unclear whether involving an
> external arbiter even helps at all.
>
> BR,
>         -- Jachym
>
> [*]
> https://www.researchgate.net/profile/David_Powell9/publication/3832120_PADRE_a_Protocol_for_Asymmetric_Duplex_REdundancy/links/0912f50d062f67bcba000000/PADRE-a-Protocol-for-Asymmetric-Duplex-REdundancy.pdf
>