Hello,<div><br></div><div>I am running distributed mnesia on a cluster of two servers, x@y and x@z. All interaction with mnesia occurs on x@y while x@z just replicates from x@y.</div><div><br></div><div>Due to an intermittent network failure, replication had stopped. In the logs for x@z was the following message:</div>
<div><br></div><div>=ERROR REPORT==== 23-Aug-2011::23:56:37 ===</div><div>** Node 'x@y' not responding **</div><div>** Removing (timedout) connection **</div><div><br></div><div>This message went unnoticed until x@y was restarted for an unrelated reason, at which point x@z logged the following error at startup:</div>
<div><br></div><div>=ERROR REPORT==== 24-Aug-2011::14:22:09 ===</div><div>Mnesia('x@z'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'x@y'}</div><div><br></div><div><br>
</div><div>Question 1: How is it possible that only x@z detected the disconnect from x@y, while x@y did not detect a disconnect from x@z?</div><div>Question 2: How is it possible that x@y did not log the running_partitioned_network error?</div>
<div><br></div><div>I realize that the answer to Q1 may lead to an answer to Q2 as well.</div><div><br></div><div>Note that neither server was set as a master node.</div>
<div><br></div><div>Thank you.</div>