[erlang-questions] RFC: mnesia majority checking

Morten Krogh mk@REDACTED
Thu Dec 9 20:38:16 CET 2010


Okay, but without Paxos or something similar, there will be some failure modes where the system becomes inconsistent. 

When do you roll back? After the commit? I was talking about a failure after the commit decision has been made. Rolling back after a commit doesn't make sense, does it?


Cheers,

Morten.
 



On Dec 9, 2010, at 7:47 PM, Ulf Wiger wrote:

> 
> On 9 Dec 2010, at 19:11, Morten Krogh wrote:
> 
>> Hi Ulf
>> 
>> Did you consider using the Paxos algorithm?
> 
> My intention for now was not to do any major surgery to
> the mnesia transaction handler, but rather extend the existing
> semantics with something useful.
> 
> So as a first step, I wanted to add the 'majority' option, since 
> I thought that would be a simple way to add quorum-style 
> safety and fencing in mnesia.
> 
>> How do you cope with node failure after the commit process has decided to commit but before the messages have arrived at the other nodes?
> 
> Actually, the asym_trans commit protocol in mnesia does
> this already. This protocol is used whenever the transaction
> contains schema updates or asymmetric replication patterns.
> It is more heavyweight than the 'sym_trans' protocol precisely
> because it deals with failures in the commit phase.
> 
> Specifically, the way it deals with failures in the commit phase 
> is that it rolls back the transaction.
> 
> BR,
> Ulf W
> 
> 
>> 
>> Morten.
>> 
>> 
>> On 12/9/10 6:25 PM, Ulf Wiger wrote:
>>> I added majority checking in the mnesia_locker as well.
>>> The main reason for doing so (apart from aborting earlier)
>>> was to enable majority checking on reads.
>>> 
>>> The way it works now is that majority checking is done on
>>> reads that use a write lock (e.g. mnesia:wread/1).
>>> A normal read, with a read lock, will succeed even in a
>>> minority. This is probably a pretty good thing.
>>> 
>>> https://github.com/uwiger/otp/commit/650f8e30d205bc1130f37c819f920f901358b937
>>> 
>>> Comments still most welcome. Monologues are fun too, but
>>> I can follow Dan North's advice and get a rubber duck for that.
>>> 
>>> If you are unsure whether this is at all needed, please chime in.
>>> It is most definitely not a stupid question.
>>> 
>>> BR,
>>> Ulf W
>>> 
>>> On 9 Dec 2010, at 15:26, Ulf Wiger wrote:
>>> 
>>>> git fetch git://github.com/uwiger/otp mnesia-majority
>>>> 
>>>> https://github.com/uwiger/otp/commit/d97ae7d4329d9342e576f3cdd893de6865449e42
>>>> 
>>>> This is a first stab at a function that I believe could be useful in
>>>> high-availability applications using mnesia.
>>>> 
>>>> At this stage, I'd love to have some comments, and suggestions,
>>>> if someone thinks of a better way to do it.
>>>> 
>>>> From the commit message:
>>>> 
>>>> "Add {majority, boolean()} per-table option.
>>>> 
>>>> With {majority, true} set for a table, write transactions will
>>>> abort if they cannot commit to a majority of the nodes that
>>>> have a copy of the table. Currently, the implementation hooks
>>>> into the prepare_commit, and forces an asymmetric transaction
>>>> if the commit set affects any table with the majority flag set.
>>>> In the commit itself, the transaction will abort if it cannot
>>>> satisfy the majority requirement for all tables involved in the
>>>> transaction.
>>>> 
>>>> A future optimization might be to abort already when a write
>>>> lock is attempted on such a table (/-object) and the lock cannot
>>>> be set on enough nodes.
>>>> 
>>>> This functionality introduces the possibility to automatically
>>>> "fence off" a table in the presence of failures.
>>>> 
>>>> This is a first implementation. Only basic tests have been
>>>> performed."
>>>> 
>>>> One particular use of this functionality would be to have a "global
>>>> resource pool" in one table with {majority, true}, and periodically
>>>> check out resources into a local buffer. If there is a failure condition,
>>>> you can use the local buffer, but not check out more resources, unless
>>>> you happen to still be in contact with more than half of the replicas.
>>>> 
>>>> This should allow for a well-defined merge after a network split.
>>>> 
>>>> BR,
>>>> Ulf W
>>>> 
>>>> Ulf Wiger, CTO, Erlang Solutions, Ltd.
>>>> http://erlang-solutions.com
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> ________________________________________________________________
>>> erlang-questions (at) erlang.org mailing list.
>>> See http://www.erlang.org/faq.html
>>> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>>> 
>> 
>> 
>> 
> 
> 
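
The majority rule described in the quoted commit message (a write aborts unless it can reach more than half of the nodes holding a copy of the table) and the "global resource pool" use can be sketched as a small pure-Erlang model. This is only an illustration under stated assumptions: the module name majority_sketch, the functions has_majority/2 and checkout/3, and the {aborted, no_majority} return value are hypothetical and are not taken from the mnesia source or from the linked commits.

```erlang
-module(majority_sketch).
-export([has_majority/2, checkout/3]).

%% Hypothetical model of the rule in the commit message; not mnesia code.
%% Replicas:  nodes that hold a copy of the table.
%% Reachable: nodes the local node can currently talk to (itself included).
-spec has_majority([node()], [node()]) -> boolean().
has_majority(Replicas, Reachable) ->
    Up = [N || N <- Replicas, lists:member(N, Reachable)],
    %% Strictly more than half, so the two sides of a network split
    %% can never both satisfy the requirement.
    2 * length(Up) > length(Replicas).

%% Resource-pool use: move up to Want items from the global Pool into a
%% local buffer, but only while a majority of the replicas is reachable.
%% The third argument is the result of has_majority/2.
-spec checkout(non_neg_integer(), [term()], boolean()) ->
          {ok, [term()], [term()]} | {aborted, no_majority}.
checkout(_Want, _Pool, false) ->
    %% In a minority: keep using what is already in the local buffer,
    %% but do not take anything more from the global pool.
    {aborted, no_majority};
checkout(Want, Pool, true) ->
    {Taken, Rest} = lists:split(min(Want, length(Pool)), Pool),
    {ok, Taken, Rest}.
```

With replicas [a@h1, b@h2, c@h3], has_majority/2 returns true when a@h1 and b@h2 are reachable and false when only c@h3 is. With four replicas, two reachable nodes are not enough, which is why an even split leaves both sides fenced off.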


