[erlang-questions] how: mnesia with simultaneous permanent node failure (EC2)

Scott Lystig Fritchie fritchie@REDACTED
Sat Dec 1 21:14:27 CET 2007


>>>>> "pm" == Paul Mineiro <paul-trapexit@REDACTED> writes:

pm> however when i tried to apply the procedure to simultaneous loss
pm> of two nodes, i ran into a problem; calling
pm> mnesia:del_table_copy/2 of schema requires all other nodes to be
pm> active, and is this scenario i have lost two nodes simultaneously
pm> (attached as test-disaster-two).

Yup, I've also discovered that feature.  Heh, some colleagues have run
into problems when they've intentionally created such a failure via
the "rm -rf" trick.  I had to explain that just because node C in a
cluster lost its mind doesn't mean that A & B & others have
forgotten that C is a member.

My solution was to create a backup of each node after a schema change,
and use that backup to bootstrap a dead node back to life ... then use
mnesia:del_table_copy() if I really wanted to remove that node from
the cluster.
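
The procedure above can be sketched roughly as follows. This is a minimal
sketch, not a tested recipe: the module name, function names, and the backup
file argument are hypothetical, and using {scope, local} so that the fallback
is installed only on the node being revived is my assumption about how to
avoid touching the surviving nodes. Whether Mnesia has to be running on the
wiped node when the fallback is installed is something to check against the
Mnesia documentation before relying on this.

```erlang
-module(node_rescue).
-export([backup_after_schema_change/1, install_local_fallback/1, evict_node/1]).

%% Take a full backup right after a schema change.
%% File is a hypothetical path such as "/var/backups/mnesia.bup".
%% Returns ok | {error, Reason}.
backup_after_schema_change(File) ->
    mnesia:backup(File).

%% Run on the node that lost its Mnesia directory.
%% Installs the backup as a fallback for this node only (assumption:
%% {scope, local} is the right choice here); the fallback is loaded the
%% next time Mnesia starts on this node.
install_local_fallback(File) ->
    mnesia:install_fallback(File, [{scope, local}]).

%% Run on a surviving node once the dead node is back in the cluster.
%% Removing a schema replica requires Mnesia to be stopped on the node
%% being removed, so stop it there first.
evict_node(Node) ->
    _ = rpc:call(Node, mnesia, stop, []),
    mnesia:del_table_copy(schema, Node).
```

With the node revived and all db_nodes reachable again,
node_rescue:evict_node(Node) is the step that actually shrinks the cluster.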

An interesting experiment that I haven't tried ... Mnesia backups, by
default, include all tuples from all tables.  A backup could be trimmed
to include only the schema table, either by using a custom backup
module or by post-processing it via mnesia:traverse_backup().  I wonder
if restoring only the schema table on each formerly-dead node is
sufficient?
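
For anyone who wants to try the experiment, here is a minimal sketch of the
post-processing step using mnesia:traverse_backup/4. The module and function
names are hypothetical. It assumes that schema records in a backup are tuples
whose first element is the atom schema and that every other item is table
data. Like the idea itself, it is untested as a recovery method.

```erlang
-module(schema_only_backup).
-export([strip/2, keep_schema/2]).

%% Copy the backup in Source to Target, keeping only schema records.
%% Returns {ok, DroppedCount} | {error, Reason}.
strip(Source, Target) ->
    mnesia:traverse_backup(Source, Target, fun keep_schema/2, 0).

%% Filter fun for traverse_backup: it must return {ItemsToWrite, NewAcc}.
%% Schema records pass through unchanged; everything else is dropped
%% and counted in the accumulator.
keep_schema(Item, Dropped) when is_tuple(Item), element(1, Item) =:= schema ->
    {[Item], Dropped};
keep_schema(_Item, Dropped) ->
    {[], Dropped + 1}.
```

The accumulator only counts dropped records, which gives a quick sanity
check that the source backup really contained table data.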

-Scott


