[erlang-questions] Mnesia transaction restarts

Bernard Duggan bduggan@REDACTED
Mon Oct 27 01:17:08 CET 2014


Thanks very much Dan. That was pretty much exactly my understanding, but it's very helpful to have it confirmed. It really seems, in our case, like the starvation-avoidance system should have precluded such a high number of retries as we're seeing, particularly when the write is only on a single record. But since it *is* happening, there must be some property of our system that I don't fully understand. Time to go digging again :)

Cheers,

Bernard

On 24/10/14 17:16, Dan Gudmundsson wrote:
Mnesia transactions are restarted when something hinders the completion of
the transaction: basically, locks that are already held, or another involved mnesia node going down during the transaction.

So in your single-node case it is caused by lock conflicts.

To avoid deadlock and starvation, a mnesia transaction is restarted if
its transaction id is newer than that of the transaction holding the lock.
Otherwise the transaction is queued until that lock is released.
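
The rule described above can be sketched as a small pure function. This is only an illustration of the decision as stated in this message, not mnesia's actual lock-manager code: the module name, the function name and the modelling of transaction ids as plain integers (larger means newer) are all assumptions made for the example.

```erlang
-module(wait_die_sketch).
-export([on_conflict/2]).

%% Hypothetical model: transaction ids are integers, larger = newer.
%% RequesterTid wants a lock that HolderTid currently holds.

%% The requester is newer than the holder: it is restarted.
on_conflict(RequesterTid, HolderTid) when RequesterTid > HolderTid ->
    restart;
%% The requester is older than the holder: it is queued until the
%% lock is released.
on_conflict(_RequesterTid, _HolderTid) ->
    queue.
```

Under this model an older transaction is never restarted because of a newer one, which is why a transaction that keeps being restarted is presumably conflicting with lock holders that started before it.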




/Dan

On Fri, Oct 24, 2014 at 6:08 AM, Bernard Duggan <bduggan@REDACTED> wrote:
Hi List,
Can anyone shed some light for me on the exact circumstances under which
mnesia transactions will restart? I ask because on our system (which I'd
describe as "moderately loaded", on the order of maybe a few million
transactions a day) we had a couple of cases recently where a relatively
simple read-then-write transaction hit 60 retries and 82 seconds. This
is a single-node system using a disc_copies table with a couple of extra
indices set on it, but nothing else very special.

Of course, mnesia did no less than what the manual promised: the
transaction eventually completed. However I'd like to understand better
what might lead to so many retries (and, as a result of the increasing
retry backoff, the very long delay). I've read every bit of "mnesia
internals" documentation I can get my hands on, but I'm still a bit
hazy. If I read that stuff correctly, it seems that unless there's a lot
of transactions that started *before* the one in question, and ran for a
very long time themselves, it should actually have been queued up to run
next rather than retrying. It seems very likely that there are other
cases that will cause a retry regardless of how "old" the transaction is
with respect to any it might conflict with (the manual casually mentions
that it may be restarted "thousands" of times), but it's those cases
that I'm not clear on.
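
One rough way to observe the retries described above is to count how many times the transaction fun is executed, since mnesia runs the fun again on every restart, and to cap the number of restarts with the Retries argument of mnesia:transaction/2. The following is a minimal sketch under stated assumptions: the module name, the function timed_update/3 and the tx_attempts process-dictionary key are hypothetical, and the transaction body is a placeholder that simply writes back the record it read.

```erlang
-module(tx_retry_sketch).
-export([timed_update/3]).

%% Hypothetical helper: runs a read-then-write on Tab/Key in a
%% transaction limited to MaxRetries restarts. Returns the transaction
%% result, the number of times the fun was executed, and the elapsed
%% time in microseconds.
timed_update(Tab, Key, MaxRetries) ->
    put(tx_attempts, 0),
    Fun = fun() ->
        %% Executed once per attempt, so this counts restarts + 1.
        put(tx_attempts, get(tx_attempts) + 1),
        case mnesia:read(Tab, Key) of
            [Rec] -> mnesia:write(Tab, Rec, write);
            []    -> ok
        end
    end,
    {Micros, Result} =
        timer:tc(fun() -> mnesia:transaction(Fun, MaxRetries) end),
    {Result, erase(tx_attempts), Micros}.
```

Logging the attempt count and elapsed time for the slow cases would at least show whether the time goes into many short attempts or into the backoff between them.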

Cheers,

Bernard

________________________________

This e-mail and any attachments are confidential. If it is not intended for you, please notify the sender, and please erase and ignore the contents.
_______________________________________________
erlang-questions mailing list
erlang-questions@REDACTED
http://erlang.org/mailman/listinfo/erlang-questions
