[erlang-questions] Distributed OTP Apps: Failover and Takeover

Andreas Pauley <>
Mon Feb 11 13:31:30 CET 2013


Hi everyone,

I've turned a demo app of mine into a distributed application to test
failover and takeover, after reading the "Distributed OTP Applications"
chapter in Learn You Some Erlang.
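
For reference, a setup like this is normally driven by the kernel
application's configuration, roughly as sketched below. This is only an
illustration: the node names and timeout values are placeholders, not
the ones from the repository.

```erlang
%% sys.config sketch for one node. Node names and timeouts are
%% hypothetical placeholders chosen for illustration.
[{kernel,
  [%% Run overlord on the first node that is up, in priority order;
   %% wait 5000 ms after a node goes down before failing over.
   {distributed, [{overlord, 5000, ['overlord_a@host1',
                                    'overlord_b@host2']}]},
   %% Nodes to wait for at boot, and for how long (ms).
   {sync_nodes_optional, ['overlord_b@host2']},
   {sync_nodes_timeout, 30000}]}].
```

The second node would carry the same distributed entry, with
sync_nodes_optional pointing back at the first node.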

Failover and takeover work great if I kill the running beam (e.g.
with kill -9).

However, I also tried sending kill signals, with exit(Pid, kill), to
the pids of both my application behavior and the top supervisor that
gets started by the application.
This crashes the VM, but failover does not happen.

Is this unsupported, or should I do something to enable failover in
this scenario?

I've done a more complete writeup with code and output here:
https://github.com/apauley/dark-overlord#when-processes-die-a-guide-to-the-afterlife

But in a nutshell, I crash the VM with the commands below, and then
automatic failover to my second node does not happen:

$ ./rel/overlord/bin/overlord console
Erlang R15B03 (erts-5.9.3.1) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe] [kernel-poll:false] [dtrace]

14:03:20.581  [overlord_app] <0.56.0> || Starting app: normal
14:03:20.582  [hypnosponge_sup] <0.57.0> || Hello from the hypnosponge supervisor
()1> Sup = pid(0, 57, 0).
<0.57.0>
()2> exit(Sup, kill).

=ERROR REPORT==== 11-Feb-2013::14:04:46 ===
** Generic server minion_supersup terminating
** Last message in was {'EXIT',<0.57.0>,killed}
** When Server state == {state,
                            {local,minion_supersup},
                            simple_one_for_one,
                            [{child,undefined,minion_makeshift_sup,
                                 {minion_makeshift_sup,start_link,[]},
                                 temporary,5000,worker,
                                 [minion_makeshift_sup]}],
                            undefined,1,3,[],minion_supersup,[]}
** Reason for termination ==
** killed
true
()3>
=INFO REPORT==== 11-Feb-2013::14:04:46 ===
    application: overlord
    exited: killed
    type: permanent

()3> {"Kernel pid terminated",application_controller,"{application_terminated,overlord,killed}"}

Crash dump was written to: erl_crash.dump
Kernel pid terminated (application_controller) ({application_terminated,overlord,killed})

-- 
http://pauley.org.za/
http://twitter.com/apauley
http://www.meetup.com/lambda-luminaries/


