[erlang-questions] mnesia recover from netsplit, can't delete node from schema

Daniel Dormont dan@REDACTED
Fri Jun 29 18:31:24 CEST 2012


I tried following Ulf's suggestion of starting a fresh node with the same
node name (ejabberd@REDACTED) as the node that, according to
mnesia:table_info(tab, all), is supposed to be holding a copy of this table,
and then adding it to the cluster using extra_db_nodes. On the new node, the
call mnesia:change_config(extra_db_nodes, ['ejabberd@REDACTED']) takes about
15 seconds to complete, but it does seem to succeed. Afterwards, the new
node shows every table, including the ones I'm interested in, as remote.
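
For anyone reproducing this step, a minimal sketch of the join plus a quick
look at where the schema records a table's replicas might be written as
follows. The module name mnesia_join_check and its two function names are
hypothetical, chosen only for illustration; the Mnesia calls themselves are
change_config/2 and table_info/2. The sketch assumes Mnesia has already been
started on the local node.

```erlang
-module(mnesia_join_check).
-export([join/1, replica_report/1]).

%% Ask the local (already started) Mnesia node to connect to Nodes.
%% change_config/2 returns {ok, Connected}, where Connected lists the
%% nodes that were actually reached, so an empty list is treated here
%% as a failure (an assumption of this sketch, not a Mnesia rule).
join(Nodes) ->
    case mnesia:change_config(extra_db_nodes, Nodes) of
        {ok, []}        -> {error, no_nodes_connected};
        {ok, Connected} -> {ok, Connected};
        {error, Reason} -> {error, Reason}
    end.

%% Collect the table_info/2 items that say where Tab is stored and
%% where it can be read from. where_to_read is 'nowhere' when no
%% loaded replica is reachable. Exits if Tab is not in the schema.
replica_report(Tab) ->
    [{Item, mnesia:table_info(Tab, Item)}
     || Item <- [storage_type, ram_copies, disc_copies,
                 disc_only_copies, where_to_read]].
```

Calling mnesia_join_check:replica_report(vcard) on both the new node and the
surviving node should show whether they agree on which node is meant to hold
the table.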

At that point, on the new node calling

mnesia:del_table_copy(vcard, node()).

returns {aborted,{badarg,vcard,unknown}}

and calling mnesia:add_table_copy(vcard, node(), ram_copies).

returns

{aborted,{system_limit,vcard,
                       {'ejabberd@REDACTED',none_active}}}

I'm not able to find any documentation on what 'none_active' means. Any
idea?
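
One possible reading, offered as an assumption rather than documented
behavior, is that none_active means no running node currently has a loaded
replica of the table to copy from. Under that assumption, a small guard can
check whether the table is readable anywhere before attempting
add_table_copy/3. The module name mnesia_copy_check and the functions
holders/1 and safe_add_copy/2 are hypothetical names for this sketch.

```erlang
-module(mnesia_copy_check).
-export([holders/1, safe_add_copy/2]).

%% Nodes that the schema lists as holding a replica of Tab,
%% regardless of whether those nodes are currently running.
holders(Tab) ->
    lists:append([mnesia:table_info(Tab, Type)
                  || Type <- [ram_copies, disc_copies, disc_only_copies]]).

%% Attempt add_table_copy/3 on the local node only when Tab can be
%% read from somewhere. Otherwise return the nodes that are supposed
%% to hold it, so the caller can see which node is missing.
safe_add_copy(Tab, Type) ->
    case mnesia:table_info(Tab, where_to_read) of
        nowhere ->
            {error, {not_loaded_anywhere, Tab, holders(Tab)}};
        _Node ->
            mnesia:add_table_copy(Tab, node(), Type)
    end.
```

If safe_add_copy(vcard, ram_copies) returns the not_loaded_anywhere tuple,
the listed holders are the nodes the schema still expects to supply the data.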

dan


On Fri, Jun 29, 2012 at 12:03 PM, Rick Pettit <rpettit@REDACTED> wrote:

>
> On Jun 29, 2012, at 10:46 AM, Daniel Dormont wrote:
>
> > Well, it appears that I may have done just that. How do I determine if
> > the database is in an inconsistent state and what can be done about that?
> > Again, I am ok with completely deleting certain tables and nodes from the
> > schema, "brutally" if need be. Is there any way of doing such a thing short
> > of wiping the entire schema and starting from scratch?
>
> Daniel,
>
> I think you might be in a situation similar (though perhaps not exactly
> like) one which I believe was solved on the Trap Exit forums:
>
>    > >>> On 24 Oct 2010, at 23:37, Jeffrey Rennie wrote:
>    > >>>
>    > >>>> I seem to be stuck in a state where I can't create a table because it
>    > >>>> exists, but I can't delete the table because it doesn't exist!
>
> You can view the thread @
> http://forum.trapexit.org/viewtopic.php?p=62092&sid=7a78bf70100c90aadea4267c921e662d
>
> Take a quick look and see if that sounds like the problem you are having.
>
> If so, I would pay particular attention to comments from Ulf W.
>
> -Rick
>
>
> > On Fri, Jun 29, 2012 at 10:41 AM, Rick Pettit <rpettit@REDACTED> wrote:
> >
> > On Jun 28, 2012, at 2:40 PM, Daniel Dormont wrote:
> >
> > > Here is the scenario that happened to me as best I can tell. I had two
> > > nodes in a cluster, let's call them A and B. B became unavailable for a
> > > while and got rebooted. When I tried to start it again, things seem to work
> > > except that certain tables seem not to exist any more. As far as I can
> > > tell, these tables used to be enabled only on B and not A, and are now in
> > > some sort of weird hybrid unavailable state.
> > >
> > > A is still running fine in production even with these tables missing,
> > > but I can't seem to get a clean start of my application (Ejabberd) on B. So
> > > what I figured I would do would be just start a fresh node on B, start
> > > Mnesia, add extra_db_nodes pointing to A and go from there. But the problem
> > > is A still thinks these certain tables exist only on B (they are listed as
> > > remote on A). Fortunately, Ejabberd is smart enough to create any tables it
> > > needs on startup, so I was thinking a clean start on B would do this. So I
> > > went into A and ran
> > >
> > > mnesia:del_table_copy(schema, B).
> > >
> > > thinking this would make the remote tables sort of go away. But
> > > instead it fails with
> > >
> > > {aborted,{no_exists,vcard_search}}
> > >
> > > And trying to delete the table directly yields the same result.
> > >
> > > Is there a way I can force Mnesia on A to completely forget about a
> > > set of remote tables (and, for that matter, the node that was supposed to
> > > store them) before I bring a new node online?
> >
> > You might want to take a look at the documentation for
> > mnesia:set_master_nodes/1,2 and maybe mnesia:force_load_table/1.
> >
> > Just make sure you understand exactly how these work before using either
> > in production--if used incorrectly, you could leave the database in an
> > inconsistent state.
> >
> > Hope that helps,
> >
> > -Rick
> >
> >
>
>