[erlang-questions] is there an elephant in the room? mnesia network partition
Jason Ganetsky
jason.ganetsky@REDACTED
Sun Nov 2 23:59:38 CET 2008
I came to the same point as you did a few months ago, and asked many
people many questions. I also felt there was a lack of discussion.
Our partitions were actually not caused by the network, but by a lack of
responsiveness on one node. It had too little memory and would end up
swapping... while it was swapping, the second node declared the first
dead... the first node would come back with its connection to the second
broken, declaring the second dead.
To solve the problem, we needed automatic healing. Essentially, you will
have to come up with some mechanism to reconcile the differences between
partitioned databases. Fortunately, in my application, there are ways to
sensibly reconstruct the data from an external MySQL source. Also, you are
going to have to minimize the externalities that may cause a partition. This
means running both nodes on the same switch and making sure they are highly
available (possibly dedicating the machine to mnesia).
However, I ended up needing to read the Mnesia source to understand how it
detects partitions, and how it subsequently behaves. Using
set_master_nodes() has a number of undesirable traits... for example, it suppresses
the running_partitioned_network message that is used to detect partitions. I
set it up so that both nodes in my pair would watch for partitions, both
would discard data with irreconcilable differences, and both would restart
Mnesia. For more details, feel free to e-mail me.
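
A minimal sketch of the kind of watcher described above, assuming Mnesia is
already running on the node when start/0 is called. The module name
partition_watcher and the helper heal/1 are hypothetical, and the step that
discards or rebuilds irreconcilable data is left as a comment because it is
application-specific. Only mnesia:subscribe/1, mnesia:stop/0 and
mnesia:start/0 are real Mnesia calls here; the inconsistent_database system
event carries the running_partitioned_network context mentioned above.

```erlang
-module(partition_watcher).
-export([start/0, init/0, classify/1]).

%% Spawn a process that watches Mnesia system events on this node.
start() ->
    spawn(?MODULE, init, []).

%% Crashes on purpose if Mnesia is not running (subscribe returns an error).
init() ->
    {ok, _} = mnesia:subscribe(system),
    loop().

loop() ->
    receive
        Msg ->
            case classify(Msg) of
                {partition, Node} -> heal(Node);
                ignore -> ok
            end,
            loop()
    end.

%% Mnesia reports a detected partition as an inconsistent_database event;
%% the context is running_partitioned_network or starting_partitioned_network.
classify({mnesia_system_event, {inconsistent_database, _Context, Node}}) ->
    {partition, Node};
classify(_Other) ->
    ignore.

%% Hypothetical healing step: log, restart Mnesia, subscribe again.
heal(Node) ->
    error_logger:warning_msg("mnesia partition detected against ~p~n", [Node]),
    stopped = mnesia:stop(),
    %% Application-specific: discard or rebuild irreconcilable data here,
    %% for example from an external source of truth.
    ok = mnesia:start(),
    %% Subscriptions do not survive a Mnesia restart, so subscribe again.
    {ok, _} = mnesia:subscribe(system),
    ok.
```

Running the same watcher on both nodes of the pair matches the setup
described above; which side discards its data is a policy decision the
sketch does not make.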
-Jason
On Sun, Nov 2, 2008 at 2:55 PM, Ulf Wiger <ulf@REDACTED> wrote:
> AFAIK, no general algorithm exists for self-healing after network
> splits. MySQL Cluster (NDB) e.g. solves it by requiring at least
> 3 copies of the data, and one arbitrator. In the case of a network
> split, you may continue if you can speak to the arbitrator; otherwise
> you're shut down.
>
> Mnesia provides the tools for resolving the situation, and one way
> to protect yourself from accidental inconsistencies is to use
> net_kernel dist_auto_connect_once, and keep a back door between
> the nodes (this has been discussed several times on this list.)
> Once you've determined that you have a split network, and which
> copies you want to continue with, you can restart the other nodes,
> possibly using mnesia:set_master_nodes/1 to make absolutely
> sure that they load their data from the right nodes.
>
> Setting this up is not terribly difficult. Interfacing to another DBMS
> is likely to be much more work, and you'd have to make really sure
> that they have a better strategy for coping with network splits than
> mnesia - I'm not at all sure that they do (but I'm willing to repent in
> the face of hard evidence).
>
> The lack of automatic handling of network splits has been mentioned
> a number of times as an argument against mnesia, but I really don't
> recall hearing much about how other DBMSs deal with it. There seems
> to be an assumption that since there isn't much discussion about
> network splits for other DBMSs, they must simply solve it transparently.
> I think this is a dangerous conclusion.
>
> BR,
> Ulf W
>
> 2008/11/2 Joel Reymont <joelr1@REDACTED>:
> > I'm looking to launch a poker 'social network', the first and only one
> > where you can actually play poker. I'm hesitant to go full-way with
> > Mnesia, though, and wonder how others are handling this.
> >
> > I googled and poked around but there seems to be an elephant in the
> > room and no one is talking about it. The elephant is that Mnesia does
> > not self-heal after network splits.
> >
> > Could it be that this is a solved problem or has anyone avoided it
> > because their data model does not require self-healing? How do big
> > projects deal with it? Ericsson?
> >
> > I would like to run a few Mnesia nodes for high availability but I
> > positively don't want my databases to diverge and I don't want to deal
> > with reconciling the databases later.
> >
> > Strictly speaking, I could keep mnesia as a transient data store and
> > keep my master database in a non-Erlang database. I just thought I'd
> > poll the community regardless.
> >
> > Thanks, Joel
> >
> > --
> > wagerlabs.com
> >
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://www.erlang.org/mailman/listinfo/erlang-questions
> >