[erlang-questions] How to debug "Kernel pid terminated"

David Mercer dmercer@REDACTED
Wed May 16 21:43:06 CEST 2012


On May 16, JD Bothma wrote:

> Two nodes on the same machine only helps if the node
> crashes. That can happen, but I think that's a more serious issue that
> should be found and fixed theoretically rather than trying to do
> failover.

As we saw here, one node crashed, the other didn't.  I am trying to figure
out how to diagnose this crash.

My use case for multiple nodes on the same host is to assist in upgrades.
Rather than preparing a release upgrade, it's much easier just to crash the
node and restart it.  However, while it is down, we would like the failover
to take over.  For this purpose, it's OK for the failover to be on the same
host.  We also have a third node, a failover on a different host, in case
the main host goes down.

Over time, a few low-priority jobs have also been added to the failover
node's role, so we can't just turn it off.  We could migrate them over to
the main node and then shut off the failover for good, but my question now
is whether anyone sees any reason why application distribution shouldn't
work in this case.
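
For concreteness, here is a minimal sketch of the kind of kernel
configuration such a setup might use with OTP's distributed applications.
The application name (myapp), the node and host names, and the timeout
values are all placeholders chosen for illustration, not taken from the
actual system.

```erlang
%% sys.config for the main node (hypothetical names throughout).
%% Each node would carry the same `distributed` entry and list the
%% *other* nodes under sync_nodes_optional.
[{kernel,
  [%% myapp runs on the first live node in this list, in priority order;
   %% 5000 is the time in ms to wait before restarting it elsewhere
   %% after the node running it goes down.
   {distributed, [{myapp, 5000,
                   ['main@host1', 'failover@host1', 'failover@host2']}]},
   %% Nodes to wait for at startup, without requiring them to be up.
   {sync_nodes_optional, ['failover@host1', 'failover@host2']},
   %% How long (ms) to wait for those nodes before continuing.
   {sync_nodes_timeout, 30000}]}].
```

Nothing in this mechanism depends on which host a node runs on; the list
only refers to node names, so two of the three nodes sharing a host is
expressed the same way as three nodes on separate hosts.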

Thanks.

Cheers,

DBM



