<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Answers inline.<div><br><div><div>On Jan 24, 2014, at 5:33 PM, Fred Hebert wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div>I'm curious about a few things:<br><br>- You mention using ACID for transactions, but later mention "The master<br>  does periodic rebroadcasts of state. Eventually it will correct<br>  itself. But it had bad data for X seconds"<br><br>  This points towards an eventually consistent solution, not a fully<br>  consistent one.<br><br></div></blockquote><div><br></div>Global state is handled separately from transactions on actors. Actors are consistent. A configuration change initiated by a user is eventually consistent.<br><br><blockquote type="cite"><div>- There is, during your leader election (picking local max) and many<br>  times around the text: "A successful 2 phase commit means a majority<br>  of nodes."<br></div></blockquote><blockquote type="cite"><div><font class="Apple-style-span" color="#000000"><br></font>  This sounds like a majority-based consensus, but the 2PC algorithm<br>  usually waits for *all* participants to have agreed. Unless you're<br>  adding majority-reads as a constraints, it sounds like you're going to<br>  be breaking ACID in the first place, and that you're not actually<br>  using 2PC, but a quorum-based consensus algorithm.<br><br></div></blockquote><div><br></div><div>Yes, you are right. We will rephrase.</div><br><blockquote type="cite"><div>- It's unclear how 'majority' is determined. Is it a majority of all the<br>  nodes *expected* in the cluster, or a majority of the nodes<br>  *currently* in the cluster? How does this deal with netsplits?<br></div></blockquote><div><br></div><div>A majority of the nodes expected in the cluster. So a three-node cluster will tolerate one missing node. Nodes can go missing for a period of time or they can shut down. 
Their missed writes will cause a restore operation for the actors whose writes they did not execute. </div><br><blockquote type="cite"><div><br>- No mention of timeout. Do you have a threshold under which a master<br>  gets de-elected for taking too long to respond? Is there an assumption<br>  here about timeouts vs. failures and how to tell them apart? Under<br>  such a scenario, how do two nodes who think of each other as masters<br>  detect the case and resolve it?<br><br></div></blockquote><div><br></div>For the actors themselves, there are no timeouts. If a node receives a nodedown message, all slave actors whose master seems to be gone are told to close. If a read/write request then comes from a client, it forces a master election if there is no master. </div><div>For global state, it responds to a nodedown message. This is admittedly a part of the system that needs more testing. <br><br><blockquote type="cite"><div>- How do these mechanism keep working following a netsplit during a<br>  multi-shard transaction?<br></div></blockquote><div><br></div><div>A transaction is run from some server. If that server can reach all actors and those actors have a majority in their clusters, the transaction will succeed. If any of those actors does not have a majority, the transaction will fail. If the transaction manager reached the point of committing the transaction but was no longer able to contact the actors to tell them to commit, the actors themselves will eventually call back to check whether it was committed or not. The entire procedure is described in chapter 2.2.3 of the documentation.</div><br><blockquote type="cite"><div><br>- When you redirect requests, are you doing the redirection as a proxy,<br>  or asking to retry directly? In the former case, what happens if the<br>  proxy node dies, but not the master actually doing the request? Or<br>  vice-versa, what if the client dies, but not the proxy node?<br><br></div></blockquote><div><br></div>As a proxy. 
If the proxy or client dies, the transaction will either be committed or not, just as with PostgreSQL.<br><br><blockquote type="cite"><div>- What happens to requests being sent during a shard migration that<br>  hasn't yet completed?<br><br></div></blockquote><div><br></div><div>Shard migration is done actor by actor. If a request hits an actor after migration, it is redirected to the new cluster. If it hits during the copy, the outcome depends on the phase of the copy. Before the last packet is sent, all writes will be committed. Copying an actor locks it only once it has sent the entire db. After it has sent the last packet, it waits for confirmation that the copy was successful. If the confirmation arrives, all requests that were queued during the lock are answered with a redirect. </div><div><br></div><br><blockquote type="cite"><div>There's probably more to ask, but yeah. Distributed systems are fun and<br>hard!<br><br>Regards,<br>Fred.<br><br>On 01/24, Sergej Jurecko wrote:<br><blockquote type="cite">hello,<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">We've put up a documentation page with more info. <br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><a href="http://www.actordb.com/docs.html">http://www.actordb.com/docs.html</a><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Hopefully it answers more questions than it raises. If not fire away.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Sergej<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">On Jan 22, 2014, at 10:19 PM, Valery Meleshkin wrote:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><blockquote type="cite">Hi Sergej,<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Which algorithms were used to build it? 
How its architecture looks like? How testing process looks like?<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Specifically I’m interested in the details of replication, inter-actor transaction coordination, replica placement and membership service (e.g. raft/paxos/2pc/ 2pc over paxos ensembles/…).<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">-- <br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Sincerely,<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">(Mr.) Valery Meleshkin<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">_______________________________________________<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">erlang-questions mailing list<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><a href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><a href="http://erlang.org/mailman/listinfo/erlang-questions">http://erlang.org/mailman/listinfo/erlang-questions</a><br></blockquote></blockquote><blockquote type="cite"><br></blockquote><br><br></div></blockquote></div><br></div></body></html>