[erlang-bugs] All keys in the same slot (Mnesia/dets)
Michael Truog
mjtruog@REDACTED
Tue Jun 7 10:01:11 CEST 2011
Hash tables often use a prime number of slots to reduce collisions. Could that be part of the problem here?
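To make the concern concrete, here is a minimal sketch for the Erlang shell. It assumes that both layers reduce the same erlang:phash2/1 value with a plain `rem`, which is a simplification and not the actual mnesia_frag_hash or dets code. NFrags, NSlots, PrimeSlots, the fragment number 7, and the integer keys are made-up values chosen only for illustration.

```erlang
%% Sketch only: the plain `rem` scheme and all constants below are
%% assumptions for illustration, not the real mnesia_frag_hash/dets logic.
NFrags = 1024,      % hypothetical number of fragments (a power of two)
NSlots = 256,       % hypothetical slot count that divides NFrags
PrimeSlots = 257,   % hypothetical prime slot count, for comparison
Keys = lists:seq(1, 100000),
%% Keys that a "hash rem NFrags" scheme would route to fragment 7.
InFrag = [K || K <- Keys, erlang:phash2(K) rem NFrags =:= 7],
%% Reusing the same hash value to pick a slot inside that fragment.
Slots = lists:usort([erlang:phash2(K) rem NSlots || K <- InFrag]),
%% The same keys with a slot count that shares no factor with NFrags.
Spread = lists:usort([erlang:phash2(K) rem PrimeSlots || K <- InFrag]),
{length(InFrag), length(Slots), length(Spread)}.
```

Because 256 divides 1024, every hash with `H rem 1024 =:= 7` also has `H rem 256 =:= 7`, so the second element of the result is 1 whenever the fragment holds at least one key. With 257 slots the same keys should spread over many slots. Whether the real tables behave this way depends on how dets actually derives the slot from the hash.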
On 06/07/2011 12:46 AM, Igor Ribeiro Sucupira wrote:
> Hmm... mnesia_frag_hash and dets both use phash2, so it makes sense
> that the keys are poorly distributed among the slots of each dets
> table that is a disc_only_copies fragment. :-(
> I guess we'll have to deal with that somehow.
>
> However, I still don't understand why one fragment has 1536 slots and
> the other has 524288, in the example below.
>
> Thank you.
> Igor.
>
> On Tue, Jun 7, 2011 at 3:19 AM, Igor Ribeiro Sucupira <igorrs@REDACTED> wrote:
>> Hello.
>>
>> I was running a pool with Erlang/OTP R13B04 until the beginning of
>> last week, when we upgraded to R14B02.
>>
>> The reason we upgraded is that every day we were experiencing a lot of
>> corruption in Mnesia's fragments (disc_only_copies). We had Erlang
>> processes checking the fragments periodically and, in case of
>> problems, we would delete the fragment (mnesia:del_table_copy) and
>> clone it again from its replica.
>> Given that Erlang/OTP R14B01 fixed a lot of concurrency issues with
>> dets, I believe those bugs were affecting us because we perform a lot
>> of dirty reads, which must cause more concurrent operations in dets
>> tables.
>>
>> Anyway, since the upgrade to R14B02, no corrupted fragment has been
>> detected. :-)
>>
>> But then I observed that, after the upgrade, most of the servers are
>> performing much better (spending less time in I/O operations), while
>> some of them barely changed in that respect.
>>
>> Taking a look at Mnesia's directory in each server, I noticed that the
>> fragment files in one table (uc) are smaller in the servers that are
>> performing better.
>>
>> Example of a fragment in a "slower" server:
>> 59M uc_frag1004.DAT
>>
>> Example of a fragment in a "faster" server:
>> 34M uc_frag598.DAT
>>
>> Since uc is a bag, I thought it could be because uc_frag598, for
>> example, has fewer records than uc_frag1004. But I copied both to my
>> box and saw I was wrong:
>>
>> 1> {ok, F1004} = dets:open_file("uc_frag1004.DAT").
>> {ok,#Ref<0.0.0.33>}
>> 2> {ok, F598} = dets:open_file("uc_frag598.DAT").
>> {ok,#Ref<0.0.0.41>}
>> 3> dets:info(F1004, no_objects).
>> 280105
>> 4> dets:info(F598, no_objects).
>> 303074
>> 5> dets:info(F1004, no_keys).
>> 1404
>> 6> dets:info(F598, no_keys).
>> 1476
>>
>> The only (very) relevant difference I found between them was in the slots:
>>
>> 7> dets:info(F1004, no_slots).
>> {256,1536,2097152}
>> 8> dets:info(F598, no_slots).
>> {524288,524288,33554432}
>>
>> The weirdest difference being that all records in uc_frag1004 are in
>> the same slot!
>>
>> 9> length([S || S <- lists:seq(0, element(2, dets:info(F1004, no_slots)) - 1),
>>                 length(dets:slot(F1004, S)) > 0]).
>> 1
>> 10> length([S || S <- lists:seq(0, element(2, dets:info(F598, no_slots)) - 1),
>>                  length(dets:slot(F598, S)) > 0]).
>> 489
>>
>>
>> So... what may have happened and what can I do to fix it?
>>
>> Thank you.
>> Igor.
>>
>> --
>> "The secret of joy in work is contained in one word - excellence. To
>> know how to do something well is to enjoy it." - Pearl S. Buck.