[erlang-questions] [ANN] ActorDB a distributed SQL database (Sergej Jurecko)

Fri Jan 24 20:22:58 CET 2014

On 01/24, Sergej Jurecko wrote:
> Answers inline.
> 
> Global state is separate from transactions to actors. Actors are consistent. Change of configuration when initiated by user is eventually consistent.
> 

Does this mean there's a risk of configuration changes not making it out
properly, but nodes still connecting? i.e. this could potentially lead
to your nodes not seeing a majority where there should be one, or the
exact opposite?

In most of the papers I have read (and in systems like Riak), the
opposite is often true: you actually want strong consistency when
modifying the cluster, and eventual one is tolerable otherwise. This
seems like a funny (and risky) choice.

> 
> Yes you are right. We will rephrase.
> 

How do you ensure that quorum consensus works with ACID? Is it doing
majority reads? It would be important to describe it right.

> 
> Majority of nodes expected in the cluster. So a three node cluster is going to tolerate 1 missing node. Nodes can go missing for a period of time or they can shut down. Their missed writes will cause a restore operation for actors whose writes they did not execute. 
> 

I can see how that works. An interestign question is what happens if all
nodes go down, and need to be booted back up. Is there a way to reload
from incompletely booted nodes? I assume that by default you only allow
to copy from healthy nodes in the cluster, but if all of them became
unhealthy, there has to be a way to restore state back -- otherwise all
nodes will be waiting for all peers to come back up, and you're more or
less deadlocked.

> 
> When it comes to actors themselves there are no timeouts. If a node receives a nodedown message, all slave actors whose master seems to be gone are told to close. If a read/write request comes from a client, it will force a master election if there is none. 
> For global state it responds to a nodedown message. This is admittedly a part of the system which needs more testing. 
> 

That is a timeout. The question is how long does it take for a timeout
to be seen as a failure, more or less. You do not keep waiting forever,
and therefore you have timeouts.

I'm not sure what you mean by 'all slaves are told to close' -- would
that conflict with the idea that any of 3 nodes can fail? I'm going with
the assumption that the slave nodes stay up if they are to re-elect a
master.

A tricky issue there is that you may have asymetric netsplits. For
example, it's possible that your client can talk to the master, but that
the masters and slaves cannot communicate together. At that point you
have a master node that gets requests from the client, and a set of
slaves that elect a master among themselves. Depending on how things
work, it is possible that both masters may receive queries at the same
time.

When the network issue is resolved, you currently have 3 nodes, two of
which believe they are masters, one of the masters which may be lagging
behind one of the nodes it expects to be a slave. That kind of conflict
resolution should be considered.

> 
> Transaction is done from some server. If that server can reach all actors and those actors have a majority in their clusters it will succeed. If any of those actors do not have a majority transaction will fail. If transaction manager reached the point of committing transaction, but was no longer able to contact actors to tell them to commit, actors themselves will eventually call back to check if it is committed or not. The entire procedure is described in chapter 2.2.3 of documentation.
> 

Fair enough, although here the client is also part of the system. The
client will not know whether it failed or not in some cases, but I'm
guessing this is fair play for most relational databases, including
players like Postgres.

> 
> As a proxy. If the proxy or client died, transaction will either be committed or not. Like postgres.
> 

I think this is the same consideration as the previous point. I guess
the question/recommendation here is to be able to make a clear
distinction to the user of the database on whether the transaction
clearly succeeded, clearly failed, or that you actually do not know, and
to be careful not to report a timeout/failed proxy as a failure, in
which case one could believe the transaction is safe to retry.

> 
> Shard migration is done actor-by-actor. If it hits an actor after migration, request will be redirected to new cluster. If it hits during copy it depends on the phase of copy. Before sending the last packet all writes will be committed. Copying an actor locks it only once it has sent the entire db. After it has sent last packet it waits for confirmation that copy was successful. If it receives it, all requests that have been queued during lock are responded with a redirect. 
> 

Anything specific happens if the node it was transfering to dies? I'm
guessing here the lock isn't there indefinitely?

In which case, be careful because it could be one of them damn
netsplits, and you suddenly have duplicated shards that both believe
they are canonical!

Regards,
Fred.