[erlang-bugs] All keys in the same slot (Mnesia/dets)
Igor Ribeiro Sucupira
igorrs@REDACTED
Tue Jun 7 08:19:36 CEST 2011
Hello.
I was running a pool with Erlang/OTP R13B04 until the beginning of
last week, when we upgraded to R14B02.
The reason we upgraded is that every day we were experiencing a lot of
corruption in Mnesia's fragments (disc_only_copies). We had Erlang
processes checking the fragments periodically and, in case of
problems, we would delete the fragment (mnesia:del_table_copy) and
clone it again from its replica.
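
For illustration, the delete-and-clone step could look roughly like the
sketch below. The module and function names (frag_repair, reclone/2) are
hypothetical, and using mnesia:add_table_copy/3 for the "clone" part is an
assumption about how the copy is restored. It also assumes another replica
of the fragment exists on a different node.

```erlang
-module(frag_repair).
-export([reclone/2]).

%% Hypothetical helper: drop the copy of fragment Frag held on Node and
%% add it back as disc_only_copies, so Mnesia copies the data from a
%% remaining replica. Assumes at least one other replica exists.
reclone(Frag, Node) ->
    case mnesia:del_table_copy(Frag, Node) of
        {atomic, ok} ->
            case mnesia:add_table_copy(Frag, Node, disc_only_copies) of
                {atomic, ok} -> ok;
                {aborted, Reason} -> {error, {add_failed, Reason}}
            end;
        {aborted, Reason} ->
            {error, {del_failed, Reason}}
    end.
```

It would be called as frag_repair:reclone(uc_frag1004, node()) on the
server holding the damaged copy.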
Given that Erlang/OTP R14B01 fixed a lot of concurrency issues with
dets, I believe those bugs were affecting us because we perform a lot
of dirty reads, which must cause more concurrent operations in dets
tables.
Anyway, since the upgrade to R14B02, no corrupted fragment has been
detected. :-)
But then I observed that, after the upgrade, most of the servers are
performing much better (spending less time in I/O operations), while
some of them almost didn't change in that respect.
Taking a look at Mnesia's directory in each server, I noticed that the
fragment files in one table (uc) are smaller in the servers that are
performing better.
Example of a fragment in a "slower" server:
59M uc_frag1004.DAT
Example of a fragment in a "faster" server:
34M uc_frag598.DAT
Since uc is a bag, I thought it could be because uc_frag598, for
example, has fewer records than uc_frag1004. But I copied both to my
box and saw I was wrong:
1> {ok, F1004} = dets:open_file("uc_frag1004.DAT").
{ok,#Ref<0.0.0.33>}
2> {ok, F598} = dets:open_file("uc_frag598.DAT").
{ok,#Ref<0.0.0.41>}
3> dets:info(F1004, no_objects).
280105
4> dets:info(F598, no_objects).
303074
5> dets:info(F1004, no_keys).
1404
6> dets:info(F598, no_keys).
1476
The only (very) relevant difference I found between them was in the slots:
7> dets:info(F1004, no_slots).
{256,1536,2097152}
8> dets:info(F598, no_slots).
{524288,524288,33554432}
The weirdest difference being that all records in uc_frag1004 are in
the same slot!
9> length([S || S <- lists:seq(0, element(2, dets:info(F1004,
no_slots)) - 1), length(dets:slot(F1004, S)) > 0]).
1
10> length([S || S <- lists:seq(0, element(2, dets:info(F598,
no_slots)) - 1), length(dets:slot(F598, S)) > 0]).
489
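
The same check can be written as a small module, which is easier to rerun
against several fragments. This is only a sketch: the module name
(slot_check) and function name are made up for illustration. It assumes
the table is already open and that dets:info(Tab, no_slots) returns a
{Min, Used, Max} tuple, as in the output above.

```erlang
-module(slot_check).
-export([nonempty_slots/1]).

%% Count how many of the used slots of an open dets table hold at
%% least one object.
nonempty_slots(Tab) ->
    {_Min, Used, _Max} = dets:info(Tab, no_slots),
    count(Tab, 0, Used, 0).

count(_Tab, I, Used, Acc) when I >= Used ->
    Acc;
count(Tab, I, Used, Acc) ->
    case dets:slot(Tab, I) of
        '$end_of_table' ->
            %% Slot index is past the end of the table: stop here.
            Acc;
        {error, Reason} ->
            erlang:error({slot_read_failed, I, Reason});
        [] ->
            count(Tab, I + 1, Used, Acc);
        [_ | _] ->
            count(Tab, I + 1, Used, Acc + 1)
    end.
```

Unlike the one-liner, it stops cleanly on '$end_of_table' and reports a
read error instead of failing inside length/1.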
So... what may have happened and what can I do to fix it?
Thank you.
Igor.
--
"The secret of joy in work is contained in one word - excellence. To
know how to do something well is to enjoy it." - Pearl S. Buck.