[erlang-questions] Watching mnesia transactions

Tue Feb 9 11:11:25 CET 2010

Do keep in mind that if you implement a replication mechanism
on top of Ulf's RDBMS or Mnesia's subscriptions (with or without
Danne's extensions), you still need to handle potential recovery
of the last transactions before a node crash.

In case of a node crash, Mnesia's transaction recovery mechanism
will recover interrupted transactions. Some of those transactions will be
committed while others will be aborted. In your (poor man's)
replication mechanism, built on top of Mnesia, you need to handle
this so that the tables Mnesia handles and the tables your code
handles are always consistent.
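
As a rough sketch of one way to detect such a mismatch after a restart
(an assumption, not something Mnesia provides): dump both copies as
{Key, Value} lists and compute what the external copy is missing or
still holds. The module name resync and the function diff/2 are
hypothetical, and how the rows are read out of either side is left open.

```erlang
-module(resync).
-export([diff/2]).

%% MnesiaRows and ExternalRows are lists of {Key, Value} pairs.
%% Returns the operations that would bring the external copy in line
%% with what Mnesia holds after recovery.
diff(MnesiaRows, ExternalRows) ->
    M = orddict:from_list(MnesiaRows),
    E = orddict:from_list(ExternalRows),
    %% Rows that are missing or different in the external copy.
    Puts = [{put, K, V} || {K, V} <- orddict:to_list(M),
                           orddict:find(K, E) =/= {ok, V}],
    %% Rows the external copy still holds but Mnesia does not.
    Dels = [{delete, K} || {K, _V} <- orddict:to_list(E),
                           not orddict:is_key(K, M)],
    Puts ++ Dels.
```

For example, resync:diff([{1, a}, {2, b}], [{1, x}, {3, c}]) gives
[{put, 1, a}, {put, 2, b}, {delete, 3}].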

/Håkan

On Tue, Feb 9, 2010 at 9:46 AM, Ulf Wiger
<ulf.wiger@REDACTED> wrote:
>
> The RDBMS contrib (jungerl) had some similar problems,
> and implemented commit/abort triggers with the help of
> the mnesia callback system. I ended up patching mnesia
> to try to come to grips with the JIT compilation of
> the data dictionary checks and reliable loading of the
> generated modules in a distributed setting as a result
> of _committed_ schema transactions. This was roughly
> the point where I called it a day and stopped working
> on it, as some of the robustness tests showed subtly
> different behaviour with and without JIT compilation.
>
> I think some of the constructs in rdbms.erl might at
> least get you further than you are today.
>
> BR,
> Ulf W
>
> Bernard Duggan wrote:
>>
>> Hi all,
>>    Bit of a complex mnesia question here.  First, I need to describe
>> what we want to do:
>>
>>    For reasons of interacting with a legacy system (which I'll call
>> "L") we want to be able to track changes to certain of our mnesia
>> tables, and send those changes off so L can update its own copy of the
>> data.  Sadly, it's thoroughly impractical to remove the local copy in L,
>> so we're pretty much stuck with this approach for now.  I'll explain the
>> approaches we've tried so far and the problems we've had - hopefully
>> someone can either suggest where our thinking has gone astray, or an
>> entirely different system :)
>>
>>
>> Attempt 1:
>> Using mnesia:subscribe({mnesia_table_event...}), we watch for table
>> events and send them off to L.  That's very nice and simple, except
>> there's no way we can see to group transaction events together - every
>> table event has its own ActivityID, certainly, but how do you know when
>> all the events for a given ActivityID have arrived?  This seems like a
>> pretty serious limitation on what otherwise is a very handy system.
>>
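For reference, a minimal sketch of the Attempt 1 setup, assuming the
documented mnesia:subscribe({table, Tab, simple}) call and its
{mnesia_table_event, {Op, Data, ActivityId}} messages. The module name
event_watch and both function names are made up for illustration.
group_by_activity/1 shows that events can be bucketed per ActivityId,
but nothing in the stream marks the last event of a transaction, which
is the limitation described above.

```erlang
-module(event_watch).
-export([watch/1, group_by_activity/1]).

%% Subscribe the calling process to simple table events for Tab and
%% print every event together with its ActivityId.
%% Assumes Mnesia is running and Tab exists.
watch(Tab) ->
    {ok, _Node} = mnesia:subscribe({table, Tab, simple}),
    loop().

loop() ->
    receive
        {mnesia_table_event, {Op, Data, ActivityId}} ->
            io:format("~p ~p (activity ~p)~n", [Op, Data, ActivityId]),
            loop()
    end.

%% Bucket already-received {Op, Data, ActivityId} events per ActivityId,
%% keeping arrival order inside each bucket.
group_by_activity(Events) ->
    lists:foldl(fun({_Op, _Data, ActivityId} = Event, Acc) ->
                        orddict:append(ActivityId, Event, Acc)
                end, orddict:new(), Events).
```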
>>
>> Attempt 2:
>> Use the mnesia_access callback module and
>> mnesia:activity(transaction...).  Here, we've done the following:
>> * Prior to each mnesia:activity call, we set up an ETS table and throw
>> it in the process dictionary.
>> * In the transaction function we first empty the ETS table (to ensure
>> that if the transaction restarts, the table will be cleared)
>> * In the activity callback functions we record to the ETS table each
>> write/delete event that takes place on mnesia.
>> * When the transaction successfully completes, we can dump that table to
>> get a full record of write events, grouped as a transaction, and throw
>> those to L.
>>
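A minimal sketch of the Attempt 2 steps above, assuming the documented
mnesia:activity/4 with an access callback module. The module name
tx_recorder, run/1 and record/1 are made up for illustration. Only
lock/4, read/5, write/5, delete/5 and delete_object/5 are shown; a
complete callback module also has to export the rest of the
mnesia_access callbacks (match_object/5, all_keys/4, select/5 and so
on) and can delegate them to mnesia in the same way.

```erlang
-module(tx_recorder).
-export([run/1]).
%% Subset of the mnesia_access callbacks; see the note above.
-export([lock/4, read/5, write/5, delete/5, delete_object/5]).

%% Run Fun as a Mnesia transaction and return {Result, Events}, where
%% Events lists the recorded write/delete operations in call order.
run(Fun) ->
    Log = ets:new(tx_log, [ordered_set, private]),
    put(tx_log, Log),
    try
        Result = mnesia:activity(transaction,
                                 fun() ->
                                         %% A restarted transaction starts
                                         %% from an empty log.
                                         ets:delete_all_objects(Log),
                                         Fun()
                                 end,
                                 [], ?MODULE),
        {Result, [Event || {_Seq, Event} <- ets:tab2list(Log)]}
    after
        erase(tx_log),
        ets:delete(Log)
    end.

%% Append Event to the log kept in the process dictionary.
record(Event) ->
    Log = get(tx_log),
    ets:insert(Log, {ets:info(Log, size) + 1, Event}).

%% Recording callbacks: perform the real operation, then log it.
write(Tid, Ts, Tab, Rec, Lock) ->
    Res = mnesia:write(Tid, Ts, Tab, Rec, Lock),
    record({write, Tab, Rec}),
    Res.

delete(Tid, Ts, Tab, Key, Lock) ->
    Res = mnesia:delete(Tid, Ts, Tab, Key, Lock),
    record({delete, Tab, Key}),
    Res.

delete_object(Tid, Ts, Tab, Rec, Lock) ->
    Res = mnesia:delete_object(Tid, Ts, Tab, Rec, Lock),
    record({delete_object, Tab, Rec}),
    Res.

%% Pass-through callbacks.
read(Tid, Ts, Tab, Key, Lock) -> mnesia:read(Tid, Ts, Tab, Key, Lock).
lock(Tid, Ts, Item, Lock) -> mnesia:lock(Tid, Ts, Item, Lock).
```

With a table t holding {t, Key, Value} records,
tx_recorder:run(fun() -> mnesia:write({t, 1, a}) end) returns the
transaction result together with the recorded events, ready to be
sent to L once the transaction has completed.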
>> We were really happy with that implementation until I realised a minor
>> problem with it this morning:
>> The posting of events to L happens /after/ the transaction completes.
>> Consider the case of two transactions, T1 and T2 in separate processes
>> which both modify the same row.  You could then have the case where:
>> * T1 runs
>> * T2 runs
>> * T2 sends its change to L
>> * T1 sends its older change to L
>> If, alternatively, we do the posting inside the transaction, even as the
>> last operation, it seems like it might be possible for the transaction
>> to restart after the post and get us in all kinds of trouble.
>>
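The interleaving above can be replayed with a toy model of L that
simply overwrites a row with whatever arrives last. This is only an
illustration; the module name race_demo and the {Key, Value} update
shape are assumptions.

```erlang
-module(race_demo).
-export([apply_in_arrival_order/1]).

%% Each update is {Key, Value}. The copy held by L is modelled as an
%% orddict in which a later arrival overwrites an earlier one.
apply_in_arrival_order(Updates) ->
    lists:foldl(fun({Key, Value}, Copy) ->
                        orddict:store(Key, Value, Copy)
                end, orddict:new(), Updates).
```

If Mnesia committed T1 and then T2, but L receives T2's update first,
race_demo:apply_in_arrival_order([{row, t2_value}, {row, t1_value}])
leaves [{row, t1_value}], so L ends up with T1's older value even
though Mnesia holds T2's.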
>> We've come up with a number of elaborate modifications to this plan to
>> resolve the issue, but I can't help but think we're missing something
>> obvious (or non-obvious).  After all, mnesia knows full well what's gone
>> on and in what order, so why do we seem to need to go to such lengths to
>> reconstruct that information?
>>
>>
>> I've slightly simplified this to keep it relatively short, so I'll
>> apologise in advance if any of your questions are answered with "yeah we
>> already thought of that and it won't work/doesn't apply because blah
>> blah" :)
>>
>> Thanks for reading this far :)
>>
>> Cheers,
>>
>> Bernard
>
>
> --
> Ulf Wiger
> CTO, Erlang Solutions Ltd, formerly Erlang Training & Consulting Ltd
> http://www.erlang-solutions.com