[erlang-questions] mnesia recovery

Dan Gudmundsson <>
Fri Jul 16 08:23:20 CEST 2010


Also, you can bootstrap a disc_node the same way you do a ram_node.

Delete everything in the mnesia_dir (but not the dir itself), start mnesia,
and call mnesia:change_config(extra_db_nodes, [AliveAndKicking]).

That should copy every table that the node is supposed to have a copy of,
provided the table is available somewhere in the system.
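
Roughly, as a sketch only: this assumes the mnesia_dir has already been
emptied and that the node you pass in is running mnesia and reachable. The
module name, function name and the 60 second timeout are made up for
illustration.

```erlang
%% Hypothetical helper module; name and timeout are illustrative only.
-module(mnesia_rejoin).
-export([rejoin/1]).

%% Call on the wiped node. AliveNode is any node already running mnesia.
rejoin(AliveNode) ->
    ok = mnesia:start(),
    case mnesia:change_config(extra_db_nodes, [AliveNode]) of
        {ok, [_ | _]} ->
            %% Connected; wait until the tables known to this node are accessible.
            Tables = mnesia:system_info(tables),
            mnesia:wait_for_tables(Tables, 60000);
        {ok, []} ->
            %% The call succeeded but no node could be contacted.
            {error, {not_connected, AliveNode}};
        {error, Reason} ->
            {error, Reason}
    end.
```

Afterwards it is probably worth checking mnesia:table_info(Tab, storage_type)
for a few tables to confirm the node got the copies you expected.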

/Dan

On Fri, Jul 16, 2010 at 5:24 AM, Igor Ribeiro Sucupira <> wrote:
> On Thu, Jul 15, 2010 at 10:41 AM, Evans, Matthew <> wrote:
>> Hi,
>>
>> This is a rather convoluted question.
>>
>> We have a distributed system with disc copies/disc only copies of mnesia tables on nodes A and B.  Other nodes in the system C
>> through M have RAM only copies of those tables.
>
> I'm assuming all nodes have exactly the same tables and the same data
> (including the schema). Is that the case? If it's not, could you
> describe the pool in more detail?
>
>> Ordinarily if node A fails and recovers shortly later we are fine since mnesia is smart enough to re-sync data back to node A from
>> node B.
>>
>> We hit a situation yesterday where node A failed, some time later the whole distributed system was restarted but node B never
>> recovered.
>
> What does that mean? Is node B corrupted? Or is it just refusing to
> start because the other nodes are down and B is not the most
> up-to-date node? I don't see any other case for "never recovered" and
> I'm assuming you have the former (corruption), since you said the
> other nodes were restarted and that B has "good" data.
>
>> The logic is such that startup is effectively blocked since we know the "good" data is on node B.
>>
>> How to handle this in the field? If, for reasons beyond our control node B can not be recovered easily, I am wondering is there a
>> way to get the data from node B to node A (I am assuming we can access the partition on node B)?
>
> Assuming B is the most up-to-date node and has some corrupted tables,
> you can copy the working files of those tables from some other node to
> node B (yeah... they may be outdated, but there's not much to do in
> this case) and then start node B. Everything should work fine.
>
> If that's not your problem, maybe this function could help you, anyway:
> http://erlang.org/doc/man/mnesia.html#force_load_table-1
>
> I've used force_load_table/1 in situations where Mnesia was refusing
> to load the table in some node because it believed its copy was not
> current (but I knew it was).
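>
> A minimal sketch of such a call (the module and wrapper names are
> hypothetical; force_load_table/1 returns yes on success, otherwise an
> error description):
>
> ```erlang
> %% Hypothetical wrapper around mnesia:force_load_table/1.
> -module(mnesia_force).
> -export([force_load/1]).
>
> %% Run on the node whose local copy should be loaded as-is.
> force_load(Tab) ->
>     case mnesia:force_load_table(Tab) of
>         yes -> ok;
>         Error -> {error, Error}
>     end.
> ```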
>
> Good luck.
> Igor.
>
>> Would it be possible to:
>>
>> 1) Stop mnesia on all nodes
>> 2) Copy the contents of the mnesia directory from node B to node A (minus the schema definitions)
>> 3) Empty the mnesia directory from node B
>> 4) Restart everything
>>
>> In this case I am hoping that mnesia would see node A as good and node B as having no data and would copy data to the new
>> node B.
>>
>> Basically this situation needs to be resolved in the field by engineers with little or no Erlang skills. Certainly escripts could be
>> written to help.
>>
>> Thanks
>>
>> Matt
>