[erlang-questions] Distributed apps when application terminates

Phillips, Christopher Christopher.Phillips@REDACTED
Wed Sep 25 21:11:23 CEST 2013

The node appears to have gone down: if I'm attached to it, I get dropped back to the shell, and if I check for running Erlang processes (ps aux | grep beam), I see nothing. The other node received a 'nodedown' message. That's what's confusing me. If the node were still up I'd understand, and in the past, when I just stopped the application manually, I accepted that it did not fail over. This is a bit different.

From: Yogish Baliga <yogishb@REDACTED>
Date: Wednesday, September 25, 2013 2:57 PM
To: Chris Phillips <christopher.phillips@REDACTED>
Cc: erlang-questions@REDACTED
Subject: Re: [erlang-questions] Distributed apps when application terminates

According to the distributed application documentation:

If the node where the application is running goes down, the application is restarted (after the specified timeout) at the first node, specified by the distributed configuration parameter, which is up and running. This is called a failover.
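For reference, the "distributed configuration parameter" mentioned there is a kernel setting. Below is a minimal sketch of what such a configuration might look like. The application name `myapp`, the node names `a@host1` and `b@host2`, and the timeout values are all hypothetical and are not taken from the original poster's release:

```erlang
%% sys.config on node a@host1 (illustrative sketch only; names are assumptions)
[{kernel,
  [%% myapp runs on a@host1 when it is up; if that node goes down,
   %% myapp is restarted on b@host2 after 5000 ms
   {distributed, [{myapp, 5000, ['a@host1', 'b@host2']}]},
   %% wait for the peer node at startup so both nodes agree on
   %% where the application should run
   {sync_nodes_optional, ['b@host2']},
   {sync_nodes_timeout, 30000}]}].
```

The second node would carry the same `distributed` entry, with `sync_nodes_optional` listing `a@host1` instead.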

In your case, your node did not go down; the supervisor was stopped. In the past I tested application failover by disabling the Ethernet adapter on the master node.

-- baliga

On Wed, Sep 25, 2013 at 11:28 AM, Phillips, Christopher <Christopher.Phillips@REDACTED> wrote:
I have a release built around a distributed application.

If I spin up two nodes, things are configured properly: if I attach to the node the application is actively running on and q() out of it, failover occurs and the application starts up on the other node.

What I'm finding is that in the same situation, if I kill the top-level supervisor (either by directly sending it an exit message, or by having a child fail enough times to exceed the maximum restart threshold), I _don't_ fail over. I do, however, receive a 'nodedown' message on the other node. I'm wondering if this is intentional, a bug, or if I'm doing something wrong.
