[erlang-questions] unsplit - resolving mnesia inconsistencies

Ulf Wiger ulf.wiger@REDACTED
Thu Feb 4 22:39:02 CET 2010


Andrew Thompson wrote:
> This is great. We've been doing this manually in not nearly so nice a
> fashion. How will this work with mnesia clusters of more than 2, like
> let's say a 3 node cluster where one node gets split off by a netsplit
> for a while - how do you avoid both of the other nodes trying to
> reconcile the split?

These are good questions. I guess the big question is how many
islands you expect to end up with in the worst case. In the
case you mention, there are still only two islands. One of the
unsplit instances will enter the critical section first (I see
no reason why the call to global:trans/3 shouldn't take the lock
on all available nodes) and resolve the split. The others should
notice that it has already been fixed once they enter the
critical section in turn.
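
A rough sketch of that pattern, not the actual unsplit code: take
a cluster-wide lock, re-check inside it whether the split is still
there, and only then repair. The module name split_guard, the
maybe_heal functions and the Heal and IsConnected funs are made up
for illustration; global:trans/3 and
mnesia:system_info(running_db_nodes) are the only library calls
used.

```erlang
-module(split_guard).
-export([maybe_heal/2, maybe_heal/3]).

%% Heal is a hypothetical fun(Node) that merges in the data from Node.
%% By default we assume the split is healed when Node is back among
%% mnesia's running db nodes.
maybe_heal(Remote, Heal) ->
    maybe_heal(Remote, Heal, fun connected/1).

maybe_heal(Remote, Heal, IsConnected) ->
    %% Ask for the lock on every node we can currently see, so that
    %% only one instance at a time runs the fun below.
    Nodes = [node() | nodes()],
    global:trans({?MODULE, self()},
                 fun() ->
                         %% Re-check inside the critical section: an
                         %% earlier lock holder may already have
                         %% repaired the split.
                         case IsConnected(Remote) of
                             true  -> already_healed;
                             false -> Heal(Remote)
                         end
                 end,
                 Nodes).

connected(Remote) ->
    lists:member(Remote, mnesia:system_info(running_db_nodes)).
```

In the 3-node case, both nodes on the majority side would call
maybe_heal/2 for the node that was cut off. Whichever gets the lock
first runs Heal; the other should get already_healed when its turn
comes.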

But I appreciate all attempts to poke holes in the approach.
If we find a scenario that is not fixable, it is certainly
better to find out this way than to have your mission-critical
system go belly-up at the worst possible time. :)

There will always be pathological cases, of course. I've
seen dual-ethernet backbones become so fragmented that
the full mesh in an Erlang network started looking like
Swiss cheese. If we can handle at least the sane error
situations in a reliable way, I'll be fairly happy.

BR,
Ulf
-- 
Ulf Wiger
CTO, Erlang Solutions Ltd, formerly Erlang Training & Consulting Ltd
http://www.erlang-solutions.com

