[erlang-questions] Mnesia Fragmentation, duplicated records after rehashing
TexTonPC
textonpc@REDACTED
Tue Oct 25 12:23:49 CEST 2011
Hi, thank you for the response. I agree, Riak is a good option for us and
we were actually planning to try it in our test environment. But it is
practically impossible to switch the production cluster to a different
product so quickly; we chose mnesia fragmentation because it is a
well-established solution, and we were not expecting serious issues in
fundamental operations like fragment management.

In the meanwhile, we tried to reproduce the duplicated-records issue and
observed the following scenario (sketched in code after this list):
1. call mnesia:change_table_frag(Table, {add_frag, [NewNode]}) for each
   new node
2. consistency checks OK, no duplicated records
3. restart mnesia, the nodes, and all the applications on the nodes
4. consistency checks FAIL, duplicated records on the source nodes of the
   rehashing operation
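
To make the scenario concrete, this is roughly what we run between the
steps above; repro/3 is a sketch, with NewNodes and AllNodes standing for
our lists of new and existing nodes:

    repro(Table, NewNodes, AllNodes) ->
        %% step 1: add one fragment on each new node, with the
        %% placement given explicitly
        [{atomic, ok} = mnesia:change_table_frag(Table, {add_frag, [N]})
         || N <- NewNodes],
        %% step 2: consistency scan here -> OK, no duplicates
        %% step 3: restart mnesia (and our applications) on every node
        [rpc:call(N, mnesia, stop, []) || N <- AllNodes],
        [rpc:call(N, mnesia, start, []) || N <- AllNodes],
        %% step 4: consistency scan here -> FAIL, duplicated records
        %% on the source nodes of the rehash
        ok.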

The involved fragments all have both ram and disc copies. It seems that
the delete operations that remove the old records from the source
fragments after rehashing are never dumped to the disc copies, so the
duplicated records reappear the next time the nodes restart.

Is there a way to force mnesia to dump all pending operations to disc?
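
For example, would something along these lines be expected to work? It
assumes mnesia:dump_log/0 flushes the not-yet-dumped part of the local
transaction log (including the deletes performed by the rehash) into the
disc copies; force_dump/1 is just a hypothetical wrapper:

    %% ask every node to dump its local transaction log to disc
    %% right after the rehash, before any node is restarted
    force_dump(Nodes) ->
        [{N, rpc:call(N, mnesia, dump_log, [])} || N <- Nodes].
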
Thank you
Alexej
On 21 October 2011 22:45, Jon Watte <jwatte@REDACTED> wrote:
> This may be a suggestion that doesn't work for you, but if you need
> fragmentation (sharding) and adding/removing nodes in real time, have you
> looked at using a higher-level system like Riak?
>
> Sincerely,
>
> jw
>
>
> --
> Americans might object: there is no way we would sacrifice our living
> standards for the benefit of people in the rest of the world. Nevertheless,
> whether we get there willingly or not, we shall soon have lower consumption
> rates, because our present rates are unsustainable.
>
>
>
> On Tue, Oct 18, 2011 at 5:25 AM, TexTonPC <textonpc@REDACTED> wrote:
>
>> Hi,
>>
>> we are encountering a strange scenario using mnesia fragmentation in
>> our production system:
>> our cluster had around 20 tables spread over 8 mnesia nodes, each
>> running on a single server, totalling 1024 frags per table (128 frags
>> per node).
>>
>> Now we added 8 new machines to the cloud and started the rehashing
>> process by adding another 128 frags per table on each new node.
>> I started this process from a different host in the cluster (lots of
>> free RAM) attached to the mnesia cluster, calling
>> mnesia:change_table_frag(Table, {add_frag, [NewNode]}) repeatedly for
>> each table, in order to end up with 2048 frags per table spread over
>> 16 nodes.
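>>
>> Roughly, the driver loop looked like this (Tables and NewNodes are
>> placeholders for our 20 table names and the 8 new nodes):
>>
>>     %% add 128 fragments per table on each of the 8 new nodes,
>>     %% growing each table from 1024 to 2048 fragments
>>     [{atomic, ok} = mnesia:change_table_frag(T, {add_frag, [N]})
>>      || T <- Tables, N <- NewNodes, _ <- lists:seq(1, 128)]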
>>
>> 1. The fragment-adding process took a week to rehash all the table
>> records while working on a single core of this "maintenance" node. I
>> read in the mnesia docs and on this list that this kind of operation
>> locks the involved table, but I was not able to parallelize across
>> tables (parallel processes, each running add_frag on a different
>> table; see the sketch after this item) in order to take advantage of
>> multiple cores. I have the feeling that add_frag "locks" the entire
>> mnesia transaction manager. Any perspectives or advice on this would
>> be greatly appreciated.
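>>
>> This is roughly the parallel attempt (one worker per table; Tables and
>> NewNodes as above), which bought no speedup:
>>
>>     %% one process per table, hoping that per-table locks would let
>>     %% the rehashes proceed in parallel on separate cores
>>     Parent = self(),
>>     [spawn_link(fun() ->
>>          [mnesia:change_table_frag(T, {add_frag, [N]})
>>           || N <- NewNodes, _ <- lists:seq(1, 128)],
>>          Parent ! {done, T}
>>      end) || T <- Tables],
>>     [receive {done, T} -> ok end || T <- Tables]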
>>
>> 2. At the end of the fragment-creation and rehashing process I noticed
>> some size unbalance between old and new frags, so I started a
>> consistency scanner that simply takes each record in each fragment and
>> checks that the mnesia_frag hashing module actually maps that record
>> to that specific fragment (see the sketch after this item). It turns
>> out that the unbalanced frags contain records that were moved to the
>> new destination frag during the rehashing process but were never
>> removed from the old source frag! I thought
>> mnesia:change_table_frag(Table, {add_frag, [NewNode]}) ran in an
>> atomic transaction context; has anyone ever faced something like this?
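>>
>> The scanner is essentially the following (a sketch; scan/1 is our
>> hypothetical helper, and it assumes that mnesia:table_info/2 under the
>> mnesia_frag access module exposes the n_fragments, hash_module and
>> hash_state items, and that fragments follow the Tab, Tab_frag2, ...
>> naming convention):
>>
>>     %% list the {FragNumber, Key} pairs whose key hashes to a
>>     %% different fragment number than the one it was found in
>>     scan(Tab) ->
>>         Info = fun(Item) ->
>>                        mnesia:activity(transaction,
>>                            fun() -> mnesia:table_info(Tab, Item) end,
>>                            [], mnesia_frag)
>>                end,
>>         NFrags = Info(n_fragments),
>>         HashMod = Info(hash_module),
>>         HashState = Info(hash_state),
>>         FragName = fun(1) -> Tab;
>>                       (I) -> list_to_atom(atom_to_list(Tab) ++ "_frag"
>>                                           ++ integer_to_list(I))
>>                    end,
>>         [{I, Key} || I <- lists:seq(1, NFrags),
>>                      Key <- mnesia:dirty_all_keys(FragName(I)),
>>                      HashMod:key_to_frag_number(HashState, Key) =/= I].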
>>
>> Thank you
>>
>> --
>> textonpc@REDACTED
>> atessaro@REDACTED
>>
>
--
textonpc@REDACTED
atessaro@REDACTED