<div dir="ltr">I agree that using stateless app servers with a highly available database like Riak is probably a simpler and more robust design than trying to build that logic yourself. If your application can be built that way, it will be a lot easier.<div><br></div><div>-Andrew</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 20, 2014 at 3:53 PM, Michael Truog <span dir="ltr"><<a href="mailto:mjtruog@gmail.com" target="_blank">mjtruog@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div text="#000000" bgcolor="#FFFFFF"><span class="">

    <div>On 11/20/2014 12:09 PM, Andrew Stone

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div>Hi,</div>

        <div><br>

          <div>I've waited for someone to jump in and say this, but it
            hasn't happened. You really, really don't want to try
            dealing with netsplits and application failover in an
            app-specific manner. It is not safe, and you will likely lose
            data. You really need a consensus algorithm like Raft or
            Paxos to handle this type of thing safely, or else you will
            end up with conflicting data on both sides of the partition.</div>

          <div><br>

          </div>

          <div>It may be quite a large dependency to rely on, but Riak
            2.0 has strongly consistent keys [1] that you could use to
            build a lock server that points to the active master server.
            Alternatively, you could use riak_ensemble [2] directly to
            build a custom solution.</div>
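          <div>As a rough illustration of the lock-server idea, here is a
            minimal Python sketch of a master lease built on an atomic
            compare-and-set. The ConsistentStore class is an in-memory
            stand-in, not the Riak API, and the key name "master", the
            function try_become_master and the lease length are all
            assumptions made for the example.</div>

```python
import threading


class ConsistentStore:
    """In-memory stand-in for a strongly consistent key-value store.

    Hypothetical: a real deployment would use a store backed by consensus.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def compare_and_set(self, key, expected, new):
        # Atomically replace the value only if it still equals `expected`.
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True


def try_become_master(store, node, now, lease_seconds=10):
    """Claim or renew the master lease; True if `node` holds it afterwards."""
    current = store.get("master")
    if current is not None:
        holder, expires_at = current
        if holder != node and expires_at > now:
            return False  # another node still holds a live lease
    # Fails if another node changed the key after we read it.
    return store.compare_and_set("master", current, (node, now + lease_seconds))


store = ConsistentStore()
a_first = try_become_master(store, "node_a", now=0)   # lease is free
b_early = try_become_master(store, "node_b", now=5)   # node_a's lease is live
b_late = try_become_master(store, "node_b", now=11)   # node_a's lease expired
```

          <div>The safety of a scheme like this rests entirely on the
            store's compare-and-set being strongly consistent, which is
            the part you should not try to build yourself.</div>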

          <div><br>

          </div>

          <div>Lastly, you can simply choose not to use failover and
            accept that when the primary goes down you will be offline
            until it comes back up. The secondary is just there to
            provide disaster recovery in case the primary is
            unrecoverable. This is a much safer and simpler solution,
            and one historically used by conventional databases with
            both asynchronous and synchronous replication. If you must
            have some level of fault tolerance/HA, you can use Paxos. If
            your application can handle eventual consistency and the
            data types fit the model, you could try to use CRDTs [3].
            That would let you stay available through a partition and
            even allow writes to happen on both servers at once!</div>
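          <div>To make the CRDT option concrete, here is a minimal Python
            sketch of a grow-only counter (G-Counter), one of the
            simplest CRDTs. The class name and the node names "primary"
            and "secondary" are assumptions made for the example; it
            only illustrates why concurrent writes on both sides of a
            partition can be merged without conflict.</div>

```python
class GCounter:
    """Grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # per-node increment totals

    def increment(self, amount=1):
        # A node never touches another node's slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


primary = GCounter("primary")
secondary = GCounter("secondary")

# During a netsplit both sides keep accepting writes.
primary.increment(3)
secondary.increment(2)

# After the partition heals, exchange state in either order.
primary.merge(secondary)
secondary.merge(primary)
```

          <div>Both replicas end up with the same total because the merge
            never discards an increment. Data that cannot be expressed
            with such a merge function does not fit this model.</div>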

          <div><br>

          </div>

          <div>I can't stress enough how important it is to not build

            ad-hoc failover protocols for this purpose. It will bite

            you. I've been bitten before, and so have many other people

            relying on this mechanism. While it may seem easier than

            using a proper distributed systems protocol at first, when

            you lose customer data in production, you quickly learn that

            easy isn't best.</div>

          <div><br>

          </div>

          <div>Best wishes,</div>

          <div>Andrew</div>

          <div><br>

          </div>

          <div>[1] <a href="http://docs.basho.com/riak/latest/dev/advanced/strong-consistency/" target="_blank">http://docs.basho.com/riak/latest/dev/advanced/strong-consistency/</a></div>

        </div>

        <div>[2] <a href="https://github.com/basho/riak_ensemble" target="_blank">https://github.com/basho/riak_ensemble</a></div>

        <div>[3] <a href="https://en.wikipedia.org/wiki/Conflict-free_replicated_data_types" target="_blank">https://en.wikipedia.org/wiki/Conflict-free_replicated_data_types</a></div>

      </div>

    </blockquote></span>

    Andrew's post above gives good advice, but it assumes that you are
    trying to maintain global state yourself.  That amounts to writing
    your own database instead of reusing one of the many that already
    exist.  I believe development time is better spent reusing an
    existing database to handle replication of state as necessary and
    to deal with the latency inherent in that process.<br>

    <br>

    To pursue lower-latency fault tolerance, it is better to have
    master-less processing in the Erlang nodes: not a quorum among
    all instances with replicas for failures, but separate instances
    of the same source code that run independently and concurrently.
    All source code execution that needs to be fault-tolerant can then
    be replicated to separate nodes, so netsplits do not affect it
    (any state the source code uses is only temporary, because it
    relies completely on the database).  If necessary, database nodes
    could share the same machines as the Erlang nodes, to avoid the
    possibility that a switch failure causes a netsplit that cuts off
    all database connections (assuming the database is one that can
    handle all the failure scenarios).<br>

    <br>

    This is the approach you can take with pg2
    (<a href="http://www.erlang.org/doc/man/pg2.html" target="_blank">http://www.erlang.org/doc/man/pg2.html</a>) or cpg
    (<a href="https://github.com/okeuday/cpg/" target="_blank">https://github.com/okeuday/cpg/</a>) to create distributed
    process groups.  If you are looking for higher-level abstractions,

    there is a service abstraction provided by CloudI

    (<a href="http://cloudi.org" target="_blank">http://cloudi.org</a>) which relies on cpg to keep service processes

    available on all Erlang nodes, despite any netsplits, pursuing this

    master-less approach.<span class=""><br>

    <br>

    <blockquote type="cite">

      <div class="gmail_extra"><br>

        <div class="gmail_quote">On Thu, Nov 20, 2014 at 9:56 AM, Mark

          Nijhof <span dir="ltr"><<a href="mailto:mark.nijhof@cre8ivethought.com" target="_blank">mark.nijhof@cre8ivethought.com</a>></span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div dir="ltr">Thank you!</div>

            <div class="gmail_extra"><br>

              <div class="gmail_quote"><span>On Thu, Nov 20,

                  2014 at 3:51 PM, Imants Cekusins <span dir="ltr"><<a href="mailto:imantc@gmail.com" target="_blank">imantc@gmail.com</a>></span>

                  wrote:<br>

                </span><span>

                  <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">the

                    code is in<br>

                    <br>

                    <a href="https://github.com/aminishiki/distr_netsplit.git" target="_blank">https://github.com/aminishiki/distr_netsplit.git</a><br>

                    <br>

                    any comments are welcome!<br>

                  </blockquote>

                </span></div>

              <br>

              <br clear="all">

              <span>

                <div><br>

                </div>

                -- <br>

                <div>

                  <div dir="ltr">Mark Nijhof<br>

                    <div>

                      <div>t:   <a href="https://twitter.com/MarkNijhof" target="_blank">@MarkNijhof</a><br>

                        s:  marknijhof</div>

                    </div>

                    <div><br>

                    </div>

                  </div>

                </div>

              </span></div>

            <br>

            _______________________________________________<br>

            erlang-questions mailing list<br>

            <a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>

            <a href="http://erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>

            <br>

          </blockquote>

        </div>

        <br>

      </div>

      <br>


    </blockquote>

    <br>

  </span></div>

</blockquote></div><br></div>