[erlang-questions] Mnesia transaction restarts

Thu Oct 30 11:56:15 CET 2014

In my experience, restarts like these are usually something you cause
yourself at the application level.

Although the transaction you are describing only accesses one record,
there might be another process that takes a whole-table lock and keeps
running for some time. This is usually a reporting or maintenance process
where you didn't realize that you were taking a whole-table lock. Such
locks are taken implicitly by qlc queries, matches and selects when you
run them inside a transaction.
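
A minimal sketch of what I mean, assuming a hypothetical reporting helper
(the module and function names are made up for illustration). The first
version runs a select over every record inside a transaction, so it holds a
read lock on the whole table until it finishes. The second runs the same
select in a dirty context, which takes no locks at all, at the cost of not
seeing a consistent snapshot:

```erlang
-module(report_sketch).
-export([count_in_tx/1, count_dirty/1]).

%% Hypothetical reporting helper: counts all records in Tab.
%% The select pattern leaves the key unbound, so inside a transaction
%% it takes a read lock on the whole table for the duration of the fun.
count_in_tx(Tab) ->
    Spec = [{mnesia:table_info(Tab, wild_pattern), [], [true]}],
    mnesia:transaction(fun() -> length(mnesia:select(Tab, Spec)) end).

%% Same query in a dirty context: no locks are taken, so writers are
%% never blocked, but the result is not transactionally consistent.
count_dirty(Tab) ->
    Spec = [{mnesia:table_info(Tab, wild_pattern), [], [true]}],
    mnesia:async_dirty(fun() -> length(mnesia:select(Tab, Spec)) end).
```

Note that count_in_tx/1 returns {atomic, N} while count_dirty/1 returns N
directly. Whether a dirty read is acceptable depends on what the report is
used for.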

If you don't have such processes, I would look for careless try/catch
patterns that execute inside a transaction. Since restarts in mnesia
transactions are signaled with exceptions, you need to be very careful
about what you catch in code that runs inside a transaction. A transaction
that should have been restarted can otherwise end up catching the restart
signal and keep holding its lock.
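
As a rough illustration (again with made-up names; Fun stands for any
application callback that may raise badarg), compare a catch-all wrapped
around the mnesia calls with a narrow catch that only covers the
application code:

```erlang
-module(tx_catch_sketch).
-export([risky_update/3, safer_update/3]).

%% Risky: the catch-all also swallows whatever exception mnesia raises
%% from read/write to abort or restart this transaction, so the fun
%% returns 'ignored' instead of letting mnesia handle the signal.
risky_update(Tab, Key, Fun) ->
    mnesia:transaction(fun() ->
        try
            [Rec] = mnesia:read(Tab, Key, write),
            mnesia:write(Tab, Fun(Rec), write)
        catch
            _:_ -> ignored
        end
    end).

%% Safer: the mnesia calls sit outside the try, and only one specific
%% error from the application callback is caught. Anything mnesia
%% raises propagates to mnesia:transaction/1 untouched.
safer_update(Tab, Key, Fun) ->
    mnesia:transaction(fun() ->
        case mnesia:read(Tab, Key, write) of
            [Rec] ->
                New = try Fun(Rec) catch error:badarg -> Rec end,
                mnesia:write(Tab, New, write);
            [] ->
                not_found
        end
    end).
```

The same caution applies to the old-style `catch Expr`, which catches
every class of exception.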

Just my 2c

2014-10-28 5:57 GMT+01:00 Bernard Duggan <bduggan@REDACTED>:

>
> On 27/10/14 20:43, Ulf Wiger wrote:
>
>
>  On 27 Oct 2014, at 01:17, Bernard Duggan <bduggan@REDACTED> wrote:
>
> But since it *is* happening there must be some property of our system that
> I don't fully understand. Time to go digging again :)
>
>
>  If the writes are replicated, you might want to look for overload
> conditions on remote mnesia_tm processes (e.g. due to mnesia_overload
> (message_queue_len)).
>
> Thanks Ulf. Inconveniently, this is only a single-node system, so there's
> no opportunity for distribution systems to get in the way :)
>
>  Dan will have to correct me if I’m wrong here...
>
>  If a younger (WaitForTid < OurTid) transaction gets held up on commit,
> older transactions can get stuck in a restart condition. Since the
> mnesia_tm process relies on (unoptimizable) selective receive, it’s
> vulnerable to long message queues (which can happen especially if you have
> lots of replicated dirty writes).
>
> I have noticed (and been tripped up by) that selective receive in the
> past. I even took a look at one stage at whether it could be new-reference
> optimised, but quickly gave up. In at least one sense I'm glad to hear you
> say that it is, in fact, unoptimisable because it means I was right to not
> spend too much time on it :)
>
> [snip]
>
>  If you have side-effects inside transactions (message send/receive),
> basically all bets are off, and really weird things can happen, not least
> transaction processes waiting endlessly for messages that were already
> sent+received, but then discarded due to transaction restart.
>
> Thanks - it's something we've definitely tried to keep in mind, making
> sure we avoid debug logging etc inside the transactions. I will, however,
> go back and re-check a few critical places to make sure we're not doing
> something silly.
>
> Cheers,
>
> Bernard