[erlang-questions] Mnesia: strategy for auto-recovery from netsplit

Daniel Dormont dan@REDACTED
Tue May 7 17:51:30 CEST 2013


It is. For example the mappings between Jabber IDs of various kinds (user,
chatroom, etc) and process IDs are kept in Mnesia tables which are
distributed - in fact this is really the core of how clustered ejabberd
works. So I will really need to do something here.

A brief past experiment suggested that ejabberd did not take kindly to a
Mnesia restart on a live node - I think I will have to restart the node.
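
For reference, a minimal sketch of what "restart the node on a partition" could look like, assuming a two-node cluster and that Mnesia is already running when the watcher starts. The module name netsplit_watcher and the tie-break rule (the node with the greater name restarts) are illustrative assumptions, not anything ejabberd provides. It relies on mnesia:subscribe(system), which delivers {inconsistent_database, Context, Node} system events to the subscribing process.

```erlang
-module(netsplit_watcher).
-behaviour(gen_server).

-export([start_link/0, should_restart/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Hypothetical tie-break: both nodes see the same pair of names, so
%% exactly one of them (the one with the greater node name) restarts.
should_restart(Self, Other) ->
    Self > Other.

init([]) ->
    %% Mnesia must already be running, otherwise this match fails.
    {ok, _Node} = mnesia:subscribe(system),
    {ok, nostate}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Context is running_partitioned_network or starting_partitioned_network.
handle_info({mnesia_system_event,
             {inconsistent_database, Context, Other}}, State) ->
    case should_restart(node(), Other) of
        true ->
            error_logger:warning_msg("Mnesia ~p with ~p, restarting~n",
                                     [Context, Other]),
            %% Restarts all applications without exiting the emulator.
            init:restart();
        false ->
            ok
    end,
    {noreply, State};
handle_info(_Info, State) ->
    {noreply, State}.
```

This only makes sense when losing the restarting node's RAM-only data is acceptable, as described in the original question. With more than two nodes the tie-break would need a real policy for choosing which side survives.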

A related question while I'm thinking of it - are there any modules out
there that can hook into the error logger (or configuration options in the
error logger) and do something different with certain log messages - for
example send them by email?
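
One possible shape for such a hook, as a sketch only: a gen_event handler installed with error_logger:add_report_handler/2 that passes selected events to a caller-supplied fun. The module name alert_handler, the matches/1 filter, and the notify fun are all hypothetical. OTP has no built-in mail sender, so the fun is where a call into an SMTP client of your choice would go.

```erlang
-module(alert_handler).
-behaviour(gen_event).

-export([install/1, matches/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2]).

%% NotifyFun is a fun/1 that receives the raw error_logger event,
%% e.g. one that formats it and hands it to a mail client.
install(NotifyFun) when is_function(NotifyFun, 1) ->
    error_logger:add_report_handler(?MODULE, NotifyFun).

%% Which events are worth an alert: here, errors and error reports.
matches({error, _GL, {_Pid, _Format, _Args}}) -> true;
matches({error_report, _GL, {_Pid, _Type, _Report}}) -> true;
matches(_Event) -> false.

init(NotifyFun) ->
    {ok, NotifyFun}.

handle_event(Event, NotifyFun) ->
    case matches(Event) of
        true ->
            %% A crash here would remove the handler, so swallow failures.
            try NotifyFun(Event) catch _:_ -> ok end;
        false ->
            ok
    end,
    {ok, NotifyFun}.

handle_call(_Request, NotifyFun) ->
    {ok, ok, NotifyFun}.

handle_info(_Info, NotifyFun) ->
    {ok, NotifyFun}.

terminate(_Reason, _NotifyFun) ->
    ok.
```

Because the handler runs inside the error_logger process, the notify fun should return quickly; a slow mail call is probably better handed off to a separate process.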

Dan


On Fri, May 3, 2013 at 12:02 PM, Michael Truog <mjtruog@REDACTED> wrote:

>  The only solution seems to be unsplit (https://github.com/uwiger/unsplit),
> where you resolve any conflicts manually.  Someone may already have an
> ejabberd integration available, but deciding which side of the netsplit
> to keep is error-prone, difficult, and sometimes impossible (depending on
> the data stored).  I think it is simpler to just hook ejabberd up to
> postgres or mysql instead of using mnesia.  Some mnesia usage still
> remains internally, but I don't think the internal mnesia usage that
> doesn't go to postgres or mysql is distributed (that would be good to
> check).
>
>
> On 05/03/2013 08:32 AM, Daniel Dormont wrote:
>
> Hi Erlangers,
>
>  I'm running ejabberd with a two-node cluster in my production
> environment. Today that system encountered a netsplit. It was properly
> recorded and logged. But I need to work on some way to automate a solution
> for this. I'm aware that the problem can't be solved in general, but there
> are two mitigating factors in my case:
>
> 1 - Almost all of my tables are RAM-only.
> 2 - None of the data are truly critical for me. That is, loss of some
> portion of the data isn't critical because my application can recover.
>
>  So in this case, I just picked a node, restarted ejabberd on it, and all
> is well. But what I'd like to do is write some actual Erlang code that can
> subscribe to the Mnesia partitioned network event and do something about
> it. What are my options there?
>
>  thanks,
> Dan
>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
>
>