Apparent race condition in dets can corrupt the file

John Hughes john.hughes@REDACTED
Thu Sep 23 23:22:48 CEST 2010


It looks as though QuickCheck has turned up another race condition in dets, 
this one involving three processes. The minimal example is:

...ensure that the file dets_table does not exist...
dets:open_file(dets_table,[{type,bag}]),
dets:close(dets_table),
dets:open_file(dets_table,[{type,bag}]),

In parallel:
    dets:lookup(dets_table,0)
    dets:insert(dets_table,{0,0})
    dets:insert(dets_table,{0,0})

dets:match_object(dets_table,'_')

All calls return the correct results until the call to match_object, which 
returns {error,{premature_eof,"dets_table"}}.

This looks as though it could be the same bug Tobbe Törnqvist mentioned 
here:
http://www.erlang.org/cgi-bin/ezmlm-cgi?4:msp:29771

"We know there is a lurking bug somewhere in the dets code. We have got 'bad 
object' and 'premature eof' every other month the last year. We have not 
been able to track the bug down since the dets files is repaired 
automatically next time it is opened."

In this case too, the file is left in a state that causes it to be repaired 
when it is reopened. After the repair, though, the data inserted into the 
table has been lost.

Caveats:

* The race condition occurs only rarely--I need to repeat the test case up 
to hundreds of times to hit it. The repetition itself may even be necessary 
to provoke the failure.

* I'm using Windows... antivirus programs or even the file system itself 
might be involved.

* I can only provoke the bug using QuickCheck--coding up the dets calls in a 
simple Erlang function does not seem to provoke it. I speculate that this is 
because QuickCheck makes the calls a little more slowly (it's interpreting 
the test case, after all), so the timing is different, and that may affect 
the likelihood of hitting the race. Of course, QuickCheck itself makes no 
dets calls internally.
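
For illustration, a plain Erlang function of the kind described in the last 
caveat might look like the following. This is a minimal sketch, not the 
attached dets_eqc.erl: the module name dets_race and the functions run/0 and 
repeat/1 are hypothetical, and it assumes the table file is created in the 
current directory under the default name dets_table.

```erlang
-module(dets_race).
-export([run/0, repeat/1]).

%% One attempt at the sequence from the minimal example:
%% fresh file, open/close/open, three calls in parallel, then match_object.
run() ->
    %% Ensure the file does not exist (an enoent error is ignored).
    _ = file:delete("dets_table"),
    {ok, dets_table} = dets:open_file(dets_table, [{type, bag}]),
    ok = dets:close(dets_table),
    {ok, dets_table} = dets:open_file(dets_table, [{type, bag}]),
    Parent = self(),
    Calls = [fun() -> dets:lookup(dets_table, 0) end,
             fun() -> dets:insert(dets_table, {0, 0}) end,
             fun() -> dets:insert(dets_table, {0, 0}) end],
    %% Run each call in its own process and send the result back.
    Pids = [spawn_link(fun() -> Parent ! {self(), F()} end) || F <- Calls],
    %% Collect the results in the same order as Calls.
    Results = [receive {Pid, R} -> R end || Pid <- Pids],
    Final = dets:match_object(dets_table, '_'),
    _ = dets:close(dets_table),
    {Results, Final}.

%% Repeat run/0 up to N times, stopping at the first attempt where
%% match_object returns something other than a list of objects.
repeat(0) ->
    ok;
repeat(N) when N > 0 ->
    case run() of
        {_, Final} when is_list(Final) -> repeat(N - 1);
        {Results, Final} -> {failed, Results, Final}
    end.
```

Calling dets_race:repeat(1000) would rerun the sequence until match_object 
returns something other than a list. As noted above, a loop this direct may 
well never hit the race.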

I'm attaching a copy of my test code, which tries the test up to 1000 times 
and, on my laptop, provokes the error virtually every time. You run it like 
this:

Erlang R14B (erts-5.8.1) [smp:2:2] [rq:2] [async-threads:0]

Eshell V5.8.1  (abort with ^G)
1> dets_eqc:bug().
Failed! Reason:
{'EXIT',{badarg,[{erlang,length,[{error,{premature_eof,"dets_table"}}]},
                 {dets_eqc,corrupted,2},
                 {dets_eqc,'-prop_parallel/0-fun-2-',3},
                 {eqc_gen,'-f388_0/1-fun-0-',3},
                 {eqc_gen,gen,3},
                 {eqc,'-f880_0/1-fun-2-',3},
                 {eqc_gen,'-f330_0/2-fun-1-',4},
                 {eqc_gen,f274_0,3}]}}
bag
1000
{[{init,{init_state,{state,undefined,bag,[],0}}},
  {set,{var,15},{call,dets_eqc,open_file,[dets_table,[{type,bag}]]}},
  {set,{var,20},{call,dets,close,[dets_table]}},
  {set,{var,21},{call,dets_eqc,open_file,[dets_table,[{type,bag}]]}}],
 [[{set,{var,22},{call,dets,lookup,[dets_table,0]}}],
  [{set,{var,24},{call,dets,insert,[dets_table,{0,0}]}}],
  [{set,{var,25},{call,dets,insert,[dets_table,{0,0}]}}]]}
History: 
[{{set,{var,15},{call,dets_eqc,open_file,[dets_table,[{type,bag}]]}},
           dets_table},
          {{set,{var,20},{call,dets,close,[dets_table]}},ok},
          {{set,{var,21},{call,dets_eqc,open_file,[dets_table,[{type,bag}]]}},
           dets_table}]
Parallel: [[{set,{var,22},{call,dets,lookup,[dets_table,0]},[]}],
           [{set,{var,24},{call,dets,insert,[dets_table,{0,0}]},ok}],
           [{set,{var,25},{call,dets,insert,[dets_table,{0,0}]},ok}]]

Res: ok
false
2>

I'm also attaching a copy of the dets file, just after the test has ended, 
in case that helps. I'm using QuickCheck version 1.22 to run the test, which 
can be downloaded from http://quviq-licencer.com/downloads/eqc-1.22.zip. 
(Note that earlier versions of QuickCheck can't run this example--it makes 
use of very recent additions). No licence is needed to run the bug() 
function above.

I'm running on an Intel L9400 (Core2Duo). It would be interesting to know 
whether this example fails on other architectures and under other operating 
systems.

John 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dets_eqc.erl
Type: application/octet-stream
Size: 7964 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20100923/18945396/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dets_table
Type: application/octet-stream
Size: 5432 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20100923/18945396/attachment-0001.obj>

