[erlang-questions] Network partition scenario
Scott Lystig Fritchie
fritchie@REDACTED
Fri Sep 19 06:34:35 CEST 2008
Jason Ganetsky <jason.ganetsky@REDACTED> wrote:
jg> I will handle this, basically, by
jg> shutting down the application on both nodes, clearing mnesia (which
jg> is acceptable in this case), restarting mnesia, and then restarting
jg> my application.
Out of curiosity ... what does "clearing mnesia" mean? Starting from
scratch, deleting all Mnesia data? Or something else?
jg> I will not use mnesia:set_master_nodes(), as it
jg> apparently causes the inconsistent_database message to be
jg> suppressed.
While the network partition was in effect, transactions on both sides
may have done globally-inconsistent things ... but one won't know that
until the partition is healed.
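For illustration, here is a minimal sketch of a process that listens for
the inconsistent_database system event mentioned above, so the
application learns about the inconsistency once the nodes see each other
again. The module name partition_watch_sketch and the logging are
hypothetical; mnesia:subscribe(system) and the event tuple are the
standard Mnesia ones, and Mnesia is assumed to be running already.

```erlang
-module(partition_watch_sketch).
-export([start/0, init/0]).

%% Spawn a watcher process on the local node.
start() ->
    spawn(?MODULE, init, []).

init() ->
    %% Assumption: Mnesia is already started on this node;
    %% otherwise subscribe/1 returns {error, Reason} and this match fails.
    {ok, _Node} = mnesia:subscribe(system),
    loop().

loop() ->
    receive
        {mnesia_system_event, {inconsistent_database, Context, Node}} ->
            %% Context says whether Mnesia was running or starting
            %% when it noticed the partition.
            error_logger:error_msg(
              "Mnesia inconsistency (~p) detected with node ~p~n",
              [Context, Node]),
            %% Application-specific recovery would be triggered here.
            loop();
        {mnesia_system_event, _Other} ->
            %% Ignore other system events in this sketch.
            loop();
        stop ->
            ok
    end.
```

The watcher only reports the event; what to do about it (for example
the shutdown-and-restart procedure described earlier) is left to the
application.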
jg> My question is: how do I get them to reconnect? Should I do this by
jg> simply calling net_adm:ping() on the other node regularly? Or is
jg> there a better way? Also, am I correct in assuming that restarting
jg> mnesia will cause them to re-sync?
You'll need some excuse for one node to communicate with the other. If
you're using the default value of "-kernel dist_auto_connect" (not "once"
or "never", see net_kernel(3)), net_adm:ping() is good enough.
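As a rough sketch of the "ping regularly" idea, assuming the default
dist_auto_connect setting: the module name reconnect_sketch and the
interval parameter are made up for illustration, and only
net_adm:ping/1 is a real library call here.

```erlang
-module(reconnect_sketch).
-export([start/2, loop/2]).

%% Spawn a process that pings Node every IntervalMs milliseconds.
start(Node, IntervalMs) ->
    spawn(?MODULE, loop, [Node, IntervalMs]).

loop(Node, IntervalMs) ->
    %% net_adm:ping/1 returns pong if the node is reachable (setting up
    %% the connection if needed) and pang otherwise. Either way we just
    %% try again on the next round.
    _ = net_adm:ping(Node),
    receive
        stop ->
            ok
    after IntervalMs ->
        ?MODULE:loop(Node, IntervalMs)
    end.
```

For example, reconnect_sketch:start('other@host', 5000) would try the
hypothetical node other@host every five seconds until the process is
sent the atom stop.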
Upon restarting, the local Mnesia instance will need to contact the other
transaction managers to determine the fate of any unresolved
transactions. That need will trigger re-connecting if dist_auto_connect
is left at its default.
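A minimal sketch of the restart step, under the assumption that a plain
stop/start of Mnesia is what is wanted (it does not "clear" any data).
The module name mnesia_restart_sketch and the timeout parameter are
hypothetical; the mnesia calls are the standard ones.

```erlang
-module(mnesia_restart_sketch).
-export([restart/1]).

%% Stop and start Mnesia, then wait up to TimeoutMs milliseconds for
%% the tables that have a local replica to become accessible.
%% Returns ok, {timeout, BadTabs} or {error, Reason}.
restart(TimeoutMs) ->
    stopped = mnesia:stop(),
    ok = mnesia:start(),
    Tabs = mnesia:system_info(local_tables),
    mnesia:wait_for_tables(Tabs, TimeoutMs).
```

A {timeout, BadTabs} result would suggest that some tables are still
waiting on another node, which is one hint that the nodes have not
re-connected yet.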
-Scott