[erlang-questions] Mnesia Fragmentation, duplicated records after rehashing

Dan Gudmundsson dangud@REDACTED
Tue Oct 25 14:04:12 CEST 2011


I would appreciate it if you could email me a minimal test case that shows the problem.

/Dan

On Tue, Oct 25, 2011 at 12:23 PM, TexTonPC <textonpc@REDACTED> wrote:
> Hi, thank you for the response. I agree, Riak is a good option for us and we
> were actually planning to try it in our test environment. But it is
> practically impossible to switch the production cluster to a different
> product so quickly; we chose mnesia fragmentation because its use is already
> well established on our side.
>
> We were not expecting serious issues in fundamental operations like fragment
> management.
>
> In the meantime, we tried to reproduce the duplicated frags issue and we
> noticed the following scenario:
>
> 1. call mnesia:change_table_frag(Table, {add_frag, [NewNode]}) for each new
> node (see the sketch after this list)
>
> 2. consistency checks OK, no duplicated frags
>
> 3. restart mnesia, the nodes, and all the applications on the nodes
>
> 4. consistency checks FAIL, duplicated records on the source nodes of the
> rehashing operation
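>
> A minimal sketch of the reproduction, assuming a single table Tab, a list of
> new nodes NewNodes, and a hypothetical check_consistency/1 helper that runs
> our scanner:
>
>     %% step 1: add one fragment per new node
>     [{atomic, ok} = mnesia:change_table_frag(Tab, {add_frag, [N]})
>      || N <- NewNodes],
>     ok = check_consistency(Tab),        %% step 2: passes
>     stopped = mnesia:stop(),            %% step 3: restart mnesia ...
>     ok = mnesia:start(),                %% ... (applications restarted too)
>     ok = mnesia:wait_for_tables([Tab], infinity),
>     ok = check_consistency(Tab).        %% step 4: fails, duplicates appear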
>
> The involved fragments all have ram and disc copies; it seems that the
> operations removing the old records after rehashing are never dumped to disc,
> so the duplicated records reappear the next time the nodes are restarted.
>
> Is there a way to force mnesia to dump all the pending operations to disc?
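>
> To illustrate what we mean, here is a sketch of what we would like to run
> after the rehash (assuming mnesia:dump_log/0 is the relevant call; AllNodes
> is just a placeholder for our node list):
>
>     %% force a user-initiated dump of the local transaction log on every node
>     rpc:multicall(AllNodes, mnesia, dump_log, []).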
>
> Thank you
>
> Alexej
>
> On 21 October 2011 22:45, Jon Watte <jwatte@REDACTED> wrote:
>>
>> This may be a suggestion that doesn't work for you, but if you need
>> fragmentation (sharding) and adding/removing nodes in real time, have you
>> looked at using a higher-level system like Riak?
>> Sincerely,
>> jw
>>
>> --
>> Americans might object: there is no way we would sacrifice our living
>> standards for the benefit of people in the rest of the world. Nevertheless,
>> whether we get there willingly or not, we shall soon have lower consumption
>> rates, because our present rates are unsustainable.
>>
>>
>>
>> On Tue, Oct 18, 2011 at 5:25 AM, TexTonPC <textonpc@REDACTED> wrote:
>>>
>>> Hi,
>>> we are encountering a strange scenario using mnesia fragmentation in our
>>> production system:
>>> Our cluster had around 20 tables spread over 8 mnesia nodes, each node
>>> running on its own server, for a total of 1024 frags per table (128 frags
>>> per node).
>>> We then added 8 new machines to the cloud and started the rehashing process
>>> by adding another 128 frags per table on each new node.
>>> I drove this process from a separate "maintenance" host (with plenty of
>>> free RAM) attached to the mnesia cluster, calling
>>> mnesia:change_table_frag(Table, {add_frag, [NewNode]}) repeatedly for each
>>> table, in order to end up with 2048 frags per table spread over 16 nodes.
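>>>
>>> Roughly how the process was driven (a sketch; Tables and NewNodes are
>>> placeholders for our 20 tables and 8 new nodes):
>>>
>>>     %% 128 add_frag calls per table on each new node: 8 * 128 = 1024 new
>>>     %% frags per table, giving 2048 frags per table in total
>>>     [{atomic, ok} = mnesia:change_table_frag(Tab, {add_frag, [Node]})
>>>      || Tab <- Tables, Node <- NewNodes, _ <- lists:seq(1, 128)].
>>>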
>>> 1. The fragment-adding process took a week to rehash all the table records
>>> while running on a single core of this "maintenance" node. I read in the
>>> mnesia docs and on this list that this kind of operation locks the involved
>>> table, but I was not able to parallelize across the different tables
>>> (parallel processes, each running add_frag on a different table, as
>>> sketched after point 2) in order to take advantage of multiple cores. I had
>>> the feeling that add_frag "locks" the entire mnesia transaction manager.
>>> Any perspective or advice on this would be greatly appreciated.
>>> 2. At the end of the fragment-creation and rehashing process I noticed some
>>> size imbalance between old and new frags, so I ran a consistency scanner
>>> (sketched below) that simply takes each record in each fragment and checks
>>> that the mnesia_frag hashing module actually maps that record to that
>>> specific fragment. It turns out that the unbalanced frags contain records
>>> that were copied to the new destination frag during the rehashing process
>>> but were never removed from the old source frag! I thought
>>> mnesia:change_table_frag(Table, {add_frag, [NewNode]}) ran in an atomic
>>> transaction context; has anyone ever faced something like this?
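>>>
>>> Roughly what the parallel attempt in point 1 looked like (Tables and
>>> NewNode are placeholders):
>>>
>>>     Parent = self(),
>>>     [spawn_link(fun() ->
>>>          Parent ! {T, mnesia:change_table_frag(T, {add_frag, [NewNode]})}
>>>      end) || T <- Tables],
>>>     [receive {T, Res} -> Res end || T <- Tables].
>>>
>>> Even with this, the add_frag calls still did not run concurrently.
>>>
>>> And the consistency scanner from point 2 is essentially the following
>>> sketch (simplified; it assumes the default mnesia_frag_hash module and the
>>> default Tab_fragN fragment naming):
>>>
>>>     check_frags(Tab) ->
>>>         %% read the hash module/state and fragment names via mnesia_frag
>>>         {HashMod, HashState, FragNames} =
>>>             mnesia:activity(sync_dirty,
>>>                             fun() ->
>>>                                 {mnesia:table_info(Tab, hash_module),
>>>                                  mnesia:table_info(Tab, hash_state),
>>>                                  mnesia:table_info(Tab, frag_names)}
>>>                             end, [], mnesia_frag),
>>>         [begin
>>>              Expected = frag_name(Tab,
>>>                                   HashMod:key_to_frag_number(HashState, Key)),
>>>              Expected =:= Frag orelse
>>>                  io:format("misplaced key ~p: found in ~p, expected ~p~n",
>>>                            [Key, Frag, Expected])
>>>          end || Frag <- FragNames, Key <- mnesia:dirty_all_keys(Frag)],
>>>         ok.
>>>
>>>     frag_name(Tab, 1) -> Tab;
>>>     frag_name(Tab, N) ->
>>>         list_to_atom(atom_to_list(Tab) ++ "_frag" ++ integer_to_list(N)).
>>>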
>>> Thank you
>>> --
>>> textonpc@REDACTED
>>> atessaro@REDACTED
>>>
>>
>
>
>
> --
> textonpc@REDACTED
> atessaro@REDACTED
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
>



More information about the erlang-questions mailing list