[erlang-questions] Mnesia race condition with indexes and dirty read operations

Mon Mar 18 19:00:18 CET 2013

Hi all, I've come across some rather surprising behavior related to
Mnesia indexes and dirty reads, and I'm wondering if this is a known
issue and if this is something that could/should be fixed. This behavior
is limited to Mnesia tables of type 'set' with secondary indexes. I
would expect that if I simultaneously update a record and do a dirty
index read on it, then I will either read out the old version
(pre-update) or the new version (post-update). But in fact it seems
there is a short window of time during which I will see neither object
and my query will return no results, even if the query matches the
record both before and after the update.

Here's an illustration of what I'm talking about and how I'm able to
reproduce the problem:

1> rd(r, {a, b, c}).
r
2> mnesia:start().
ok
3> mnesia:create_table(r, [{attributes, record_info(fields, r)}]).
{atomic,ok}
4> mnesia:add_table_index(r, b).
{atomic,ok}
5> [mnesia:dirty_write(#r{a=I, b=none, c=0}) || I <- lists:seq(0, 100000)].
[ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,
  ok,ok,ok,ok,ok,ok,ok,ok,ok,ok|...]
6> mnesia:dirty_write(#r{a=0, b=none, c=1}).
ok
7> mnesia:dirty_match_object(#r{b=none, c=1, _='_'}).
[#r{a = 0,b = none,c = 1}]
8> spawn(fun() -> mnesia:transaction(fun() -> mnesia:write(#r{a=0, b=none, c=1}) end) end),
8> mnesia:dirty_match_object(#r{b=none, c=1, _='_'}).
[]
9> spawn(fun() -> mnesia:transaction(fun() -> mnesia:write(#r{a=0, b=none, c=1}) end) end),
9> timer:sleep(100),
9> mnesia:dirty_match_object(#r{b=none, c=1, _='_'}).
[#r{a = 0,b = none,c = 1}]

Step 5 takes quite a while on my machine, but seems to be necessary to
reliably reproduce what I'm seeing. I believe this is due to the massive
slowdown that all the duplicate index values incur, which in turn allows
us to expose the underlying race condition more easily.

If I rerun step 8 repeatedly, I only get back an empty list about half
the time, seemingly at random. Adding the timer:sleep(100) seems to be
enough to get back the record I expect every time though. This behavior
is much more reliably reproducible if I use a dirty_write instead of a
transactional write, but I generally mistrust dirty writes for most
cases anyway, so I thought this would make a more relevant and
interesting example :-)
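
For anyone who would rather not retype the shell session, here is a rough
module version of steps 2 through 8. It is only a sketch: the module name
index_race and the run/2 and attempt/0 helpers are my own inventions, it
assumes a fresh node where table r does not exist yet, and whether the
empty result shows up at all will depend on your machine and on how many
duplicate index values you load.

```erlang
-module(index_race).
-export([run/2]).

-record(r, {a, b, c}).

%% Load Fill+1 records that all share the index value 'none', then repeat
%% step 8 Attempts times. Returns {EmptyReads, NonEmptyReads}.
run(Fill, Attempts) ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(r, [{attributes, record_info(fields, r)}]),
    {atomic, ok} = mnesia:add_table_index(r, b),
    [ok = mnesia:dirty_write(#r{a = I, b = none, c = 0}) || I <- lists:seq(0, Fill)],
    ok = mnesia:dirty_write(#r{a = 0, b = none, c = 1}),
    lists:foldl(fun(_, {Empty, Found}) ->
                        case attempt() of
                            [] -> {Empty + 1, Found};
                            [_ | _] -> {Empty, Found + 1}
                        end
                end, {0, 0}, lists:seq(1, Attempts)).

%% One round of step 8: rewrite the record in a transaction from another
%% process while this process does the dirty read.
attempt() ->
    {Pid, Ref} = spawn_monitor(fun() ->
        {atomic, ok} = mnesia:transaction(fun() ->
            mnesia:write(#r{a = 0, b = none, c = 1})
        end)
    end),
    Result = mnesia:dirty_match_object(#r{b = none, c = 1, _ = '_'}),
    %% Wait for the writer to exit so consecutive attempts do not overlap.
    receive {'DOWN', Ref, process, Pid, _Reason} -> ok end,
    Result.
```

Calling index_race:run(100000, 20) mirrors the session above; a first
element greater than zero means at least one read fell into the window.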

I realize dirty operations are warned against, but this seems
exceptionally counter-intuitive to me, even for a dirty op. From looking
at the mnesia_index.erl code it looks like there's a race condition
stemming from the add_index2 function, which first deletes any old
index entries for the given object and then regenerates them from scratch.
(And notably, this does not appear to be an issue for Mnesia tables of
type 'bag', only for 'set' tables.) Would this be considered a bug? It
seems like a bug to me, but if it's not, then I feel like maybe there
should at least be some kind of extra warning in the documentation for
the mnesia:dirty_index_* functions. Thoughts?
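
To make the suspected mechanism concrete, here is a toy model of the
delete-then-regenerate sequence using a plain ets table. This is not the
actual mnesia_index code, just my reading of it boiled down: the module
name index_model, the demo/0 function and the {IndexValue, Key} layout of
the stand-in index table are all assumptions made for illustration.

```erlang
-module(index_model).
-export([demo/0]).

%% Idx stands in for the index table: a bag mapping a secondary index
%% value to the primary key of the record that carries it.
demo() ->
    Idx = ets:new(idx, [bag]),
    true = ets:insert(Idx, {none, 0}),
    Before = ets:lookup(Idx, none),
    %% First half of the update: drop the old index entry for key 0.
    true = ets:delete_object(Idx, {none, 0}),
    %% A dirty reader that looks at the index right now finds nothing.
    During = ets:lookup(Idx, none),
    %% Second half of the update: add the entry back for the new version.
    true = ets:insert(Idx, {none, 0}),
    After = ets:lookup(Idx, none),
    true = ets:delete(Idx),
    {Before, During, After}.
```

The middle element of the returned tuple is the empty list, even though
the index value is the same before and after the update. That matches
what step 8 shows, assuming a dirty reader can interleave between the two
halves.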

Thanks,
Nick
