<div dir="ltr">I agree that using stateless app servers with a highly available database like Riak is probably a simpler and more robust design than trying to build that logic yourself. If your application can be built that way, it will be a lot easier.<div><br></div><div>-Andrew</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 20, 2014 at 3:53 PM, Michael Truog <span dir="ltr">&lt;<a href="mailto:mjtruog@gmail.com" target="_blank">mjtruog@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span class="">
<div>On 11/20/2014 12:09 PM, Andrew Stone
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi,</div>
<div><br>
<div>I've waited for someone to jump in and say this, but it
hasn't happened. You really, really don't want to try
dealing with netsplits and application failover in an
app-specific manner. It is not safe, and you will likely lose
data. You really need a consensus algorithm like Raft or
Paxos to handle this type of thing safely, or else you will
end up with conflicting data on both sides of the partition.</div>
<div><br>
</div>
<div>It may be quite a large dependency to rely on, but Riak
2.0 has strongly consistent keys [1] that you could use to
build a lock server that points to the active master server.
Alternatively, you could use riak_ensemble [2] directly to
build a custom solution.</div>
<div><br>
</div>
<div>Lastly, you can simply choose to not use failover and
accept that when the primary goes down you will be offline
until it comes back up. The secondary is just there to
provide disaster recovery in case the primary is
irrecoverable. This is a much safer and simpler solution,
and one historically used by conventional databases with
both asynchronous and synchronous replication. If you must
have some level of fault tolerance/HA, you can use Paxos. If
your application can handle eventual consistency and the
data types fit the model, you could try to use CRDTs [3].
That would allow you 100% availability and even allow writes
to happen to both servers at once!</div>
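<div>To make the CRDT option concrete, here is a minimal sketch of a
grow-only counter, one of the simplest CRDTs described in [3]. It is
written in Python purely for illustration and is not how Riak
implements its data types; the class name GCounter and the replica
names "primary" and "secondary" are hypothetical. Each replica only
increments its own slot, and merging takes the per-replica maximum, so
both servers can accept writes during a partition and still converge
once they exchange state.</div>
<pre>
```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by per-slot max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id mapped to that replica's local count

    def increment(self, amount=1):
        # A replica only ever changes its own slot.
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def value(self):
        # The counter value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other):
        # Per-slot max is commutative, associative and idempotent,
        # so merges can arrive in any order and be repeated safely.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)


# Both servers accept writes while partitioned (hypothetical replica names).
primary = GCounter("primary")
secondary = GCounter("secondary")
primary.increment(3)
secondary.increment(2)

# After the partition heals, the replicas exchange state and converge.
primary.merge(secondary)
secondary.merge(primary)
```
</pre>
<div>This only works because a counter that never decreases has an
obvious merge rule; data that needs a single authoritative writer
still calls for the consensus-based options above.</div>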
<div><br>
</div>
<div>I can't stress enough how important it is to not build
ad-hoc failover protocols for this purpose. It will bite
you. I've been bitten before, and so have many other people
relying on this mechanism. While it may seem easier than
using a proper distributed systems protocol at first, when
you lose customer data in production, you quickly learn that
easy isn't best.</div>
<div><br>
</div>
<div>Best wishes,</div>
<div>Andrew</div>
<div><br>
</div>
<div>[1] <a href="http://docs.basho.com/riak/latest/dev/advanced/strong-consistency/" target="_blank">http://docs.basho.com/riak/latest/dev/advanced/strong-consistency/</a></div>
</div>
<div>[2] <a href="https://github.com/basho/riak_ensemble" target="_blank">https://github.com/basho/riak_ensemble</a></div>
<div>[3] <a href="https://en.wikipedia.org/wiki/Conflict-free_replicated_data_types" target="_blank">https://en.wikipedia.org/wiki/Conflict-free_replicated_data_types</a></div>
</div>
</blockquote></span>
The perspective above (Andrew's post) is good, but it assumes
that you are trying to maintain global state yourself. That means
you are creating your own database instead of reusing one of the
many that already exist. I believe development time is better spent
reusing an existing database to handle replication of state as
necessary and to deal with the latency inherent in that process.<br>
<br>
To pursue lower-latency fault tolerance, it is better to have
master-less processing in the Erlang nodes (not a quorum among all
instances with replicas for failures, but separate instances of the
source code that run independently and concurrently). Then all
the source code execution that needs to be fault-tolerant can be
replicated to separate nodes, so netsplits do not affect it
(any state the source code uses is only temporary, because it
relies completely on the database). If necessary, database nodes
could share the same machines as the Erlang nodes, to avoid the
possibility that a switch failure causes a netsplit that cuts off
all database connections (assuming the database is one that can
handle all the failure scenarios).<br>
<br>
This is the approach you can take with pg2
(<a href="http://www.erlang.org/doc/man/pg2.html" target="_blank">http://www.erlang.org/doc/man/pg2.html</a>) or cpg
(<a href="https://github.com/okeuday/cpg/" target="_blank">https://github.com/okeuday/cpg/</a>) to create distributed
process groups. If you are looking for higher-level abstractions,
CloudI (<a href="http://cloudi.org" target="_blank">http://cloudi.org</a>) provides a service abstraction
that relies on cpg to keep service processes available on all
Erlang nodes despite any netsplits, following this
master-less approach.<span class=""><br>
<br>
<blockquote type="cite">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Nov 20, 2014 at 9:56 AM, Mark
Nijhof <span dir="ltr">&lt;<a href="mailto:mark.nijhof@cre8ivethought.com" target="_blank">mark.nijhof@cre8ivethought.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Thank you!</div>
<div class="gmail_extra"><br>
<div class="gmail_quote"><span>On Thu, Nov 20,
2014 at 3:51 PM, Imants Cekusins <span dir="ltr">&lt;<a href="mailto:imantc@gmail.com" target="_blank">imantc@gmail.com</a>&gt;</span>
wrote:<br>
</span><span>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">the
code is in<br>
<br>
<a href="https://github.com/aminishiki/distr_netsplit.git" target="_blank">https://github.com/aminishiki/distr_netsplit.git</a><br>
<br>
any comments are welcome!<br>
</blockquote>
</span></div>
<br>
<br clear="all">
<span>
<div><br>
</div>
-- <br>
<div>
<div dir="ltr">Mark Nijhof<br>
<div>
<div>t: <a href="https://twitter.com/MarkNijhof" target="_blank">@MarkNijhof</a><br>
s: marknijhof</div>
</div>
<div><br>
</div>
</div>
</div>
</span></div>
<br>
_______________________________________________<br>
erlang-questions mailing list<br>
<a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>
<a href="http://erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</span></div>
</blockquote></div><br></div>