[erlang-questions] mnesia race condition in add_table_copy

karol skocik karol.skocik@REDACTED
Fri Jul 8 11:25:10 CEST 2011


I will do that later this evening, unless I find a fix for it first.
Unfortunately, reproducing it requires a "non-small" amount of setup code
for cluster/mnesia startup, which I will need to separate out.
Thanks, Karol

On Fri, Jul 8, 2011 at 10:38 AM, Dan Gudmundsson <dangud@REDACTED> wrote:
> Can you send me (privately) a small test program that shows the problem?
>
> /Dan
>
> On Fri, Jul 8, 2011 at 10:30 AM, karol skocik <karol.skocik@REDACTED>
> wrote:
>>
>> In addition to this, the obvious question:
>> what should I do to ensure that the schema change has propagated after
>> mnesia:add_table_copy returns?
>>
>> Or alternatively, what should I call **before**
>> mnesia:table_info(Table, disc_copies) to get all nodes where the copy
>> resides?
>>
>> Thanks for any suggestions,
>> Karol
>>
>> On Thu, Jul 7, 2011 at 7:01 PM, karol skocik <karol.skocik@REDACTED>
>> wrote:
>> > Hi,
>> >  I think I have found a race condition in mnesia:add_table_copy.
>> > I am trying to add a table copy when a new node appears in the cluster (or
>> > add a table copy to another node when one holding a copy fails) and
>> > the number of copies is less than some required count.
>> >
>> > The idea is simple: I first spawn a new process on every node in the
>> > cluster, and in each of these processes I create a global transaction
>> > using global:trans with ID = {add_table_trans, table_name}.
>> > The first process to grab the transaction lock checks whether more
>> > table copies are required and, if so, creates a new copy on some node
>> > that does not have one.
>> > When the copy is created, this process exits, and another process on
>> > a different node gets the transaction lock and tries to do the same.
>> >
>> > The problem is that the second process checks where the copies are
>> > using mnesia:table_info(table_name, disc_copies), and this list
>> > is sometimes incomplete: it is missing the very last node, which got
>> > its table copy from the first process.
>> > This is easy to verify. In the second process:
>> > Copies1 = mnesia:table_info(table_name, disc_copies),
>> > timer:sleep(2000),
>> > Copies2 = mnesia:table_info(table_name, disc_copies).
>> >
>> > Then, mnesia:add_table_copy fails with
>> > {aborted,{already_exists,table_name,LastAddedNode}}
>> >
>> > Since the transaction lock ensures that no other process can add
>> > another table copy, I guess this is a race condition in which the new
>> > table copy's node has not been propagated to the schema on all nodes
>> > by the time mnesia:add_table_copy returns.
>> >
>> > Karol
>> >
>
>

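For readers following the thread, the scheme described in the original message might look roughly like the sketch below. The module name copy_keeper, the functions ensure_copies/2 and pick_target/2, and the Wanted parameter are hypothetical names chosen for illustration; only global:trans/2, mnesia:table_info/2, mnesia:system_info/1 and mnesia:add_table_copy/3 come from OTP. The sketch assumes mnesia is already running with a disc schema on every candidate node. Treating already_exists as a harmless outcome is only one possible way to tolerate a stale table_info result; it does not answer whether the schema change is guaranteed to be visible everywhere once add_table_copy returns.

```erlang
-module(copy_keeper).
-export([ensure_copies/2, pick_target/2]).

%% Hypothetical entry point: make sure Tab has at least Wanted disc copies.
%% The lock id {ResourceId, LockRequesterId} serialises callers cluster-wide.
ensure_copies(Tab, Wanted) ->
    LockId = {{add_table_trans, Tab}, self()},
    global:trans(LockId, fun() -> maybe_add_copy(Tab, Wanted) end).

%% Runs while the global lock is held.
maybe_add_copy(Tab, Wanted) ->
    Copies = mnesia:table_info(Tab, disc_copies),
    Running = mnesia:system_info(running_db_nodes),
    case pick_target(Running, Copies) of
        {ok, Node} when length(Copies) < Wanted ->
            case mnesia:add_table_copy(Tab, Node, disc_copies) of
                {atomic, ok} ->
                    {added, Node};
                %% Assumption: our local view of the copies was stale and
                %% Node already holds a copy, so this is not an error.
                {aborted, {already_exists, Tab, Node}} ->
                    {already_exists, Node};
                {aborted, Reason} ->
                    {error, Reason}
            end;
        _ ->
            %% Either enough copies exist or no node is left to receive one.
            nothing_to_do
    end.

%% Pure helper: first candidate node that does not yet hold a copy.
pick_target(Candidates, Copies) ->
    case Candidates -- Copies of
        [Node | _] -> {ok, Node};
        [] -> none
    end.
```

A call such as copy_keeper:ensure_copies(table_name, 2) from a process on each node would then either add one copy, report that the chosen node already had one, or do nothing.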

