[erlang-questions] is there an elephant in the room? mnesia network partition

Eli Liang <>
Sun Nov 2 22:42:53 CET 2008

Oracle deals with it very transparently starting with 10g via the Oracle Cluster Registry (OCR) and its Cluster Ready Services Daemon (CRSD). In particular, the Oracle Cluster Synchronization Services Daemon (OCSSD) sorts out any data corruption and gets data back into sync once the nodes can communicate again. All three of these services work together to handle automated healing after a network split. No administrator input is required.

--- On Sun, 11/2/08, Ulf Wiger <> wrote:

From: Ulf Wiger <>
Subject: Re: [erlang-questions] is there an elephant in the room? mnesia network partition
To: "Joel Reymont" <>
Cc: "Erlang Questions" <>
Date: Sunday, November 2, 2008, 2:55 PM

AFAIK, no general algorithm exists for self-healing after network
splits. MySQL Cluster (NDB) e.g. solves it by requiring at least
3 copies of the data, and one arbitrator. In the case of a network
split, you may continue if you can speak to the arbitrator; otherwise
you're shut down.
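
The arbitrator rule can be sketched roughly as follows. This is only an
illustration in Python, not NDB's actual implementation: the Arbitrator
class, may_continue() and the partition ids are hypothetical names, and
the "first partition to ask wins" policy is an assumption made to keep
the example small.

```python
class Arbitrator:
    """Hypothetical arbitrator: grants survival to exactly one partition."""

    def __init__(self):
        self._winner = None  # id of the partition allowed to continue

    def request_permission(self, partition_id):
        # Assumed policy: the first partition that asks wins the split.
        if self._winner is None:
            self._winner = partition_id
        return self._winner == partition_id


def may_continue(partition_id, arbitrator):
    """Decide whether one side of a split may keep running.

    `arbitrator` is None when this side cannot reach the arbitrator,
    in which case it must shut down, as described above.
    """
    if arbitrator is None:
        return False
    return arbitrator.request_permission(partition_id)


if __name__ == "__main__":
    arb = Arbitrator()
    print(may_continue("side-a", arb))   # reaches the arbitrator first
    print(may_continue("side-b", None))  # cannot reach the arbitrator
```

Because at most one side is ever granted permission, two partitions can
never both keep accepting writes, which is what prevents the copies from
diverging.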

Mnesia provides the tools for resolving the situation, and one way
to protect yourself from accidental inconsistencies is to use
the kernel parameter dist_auto_connect set to once (erl -kernel
dist_auto_connect once), and keep a back door between the nodes
(this has been discussed several times on this list).
Once you've determined that you have a split network, and which
copies you want to continue with, you can restart the other nodes,
possibly using mnesia:set_master_nodes/1 to make absolutely
sure that they load their data from the right nodes.

Setting this up is not terribly difficult. Interfacing to another DBMS
is likely to be much more work, and you'd have to make really sure
that they have a better strategy for coping with network splits than
mnesia - I'm not at all sure that they do (but I'm willing to repent in
the face of hard evidence).

The lack of automatic handling of network splits has been mentioned
a number of times as an argument against mnesia, but I really don't
recall hearing much about how other DBMSs deal with it. There seems
to be an assumption that since there isn't much discussion about
network splits for other DBMSs, they must simply solve it transparently.
I think this is a dangerous conclusion.

Ulf W

2008/11/2 Joel Reymont <>:
> I'm looking to launch a poker 'social network', the first and only one
> where you can actually play poker. I'm hesitant to go full-way with
> Mnesia, though, and wonder how others are handling this.
> I googled and poked around but there seems to be an elephant in the
> room and no one is talking about it. The elephant is that Mnesia does
> not self-heal after network splits.
> Could it be that this is a solved problem or has anyone avoided it
> because their data model does not require self-healing? How do big
> projects deal with it? Ericsson?
> I would like to run a few Mnesia nodes for high availability but I
> positively don't want my databases to diverge and I don't want to
> deal with reconciling the databases later.
> Strictly speaking, I could keep mnesia as a transient data store and
> keep my master database in a non-Erlang database. I just thought I'd
> poll the community regardless.
>        Thanks, Joel
> --
> wagerlabs.com
> _______________________________________________
> erlang-questions mailing list
> http://www.erlang.org/mailman/listinfo/erlang-questions