[erlang-questions] Takeover failure

Tyron Zerafa tyron.zerafa@REDACTED
Sun Dec 1 17:19:25 CET 2013


Hi all,

    I am trying to understand how to implement takeover in Erlang by
following the example presented in Learn You Some Erlang
<http://learnyousomeerlang.com/distributed-otp-applications>.
Basically, I am starting the application's supervisor as follows:

start(normal, []) ->
    m8ball_sup:start_link();
start({takeover, _OtherNode}, []) ->
    m8ball_sup:start_link().


*Supervisor init code:*
start_link() ->
    supervisor:start_link({global, ?MODULE}, ?MODULE, []).

*Supervisor child Specification:*
{
  {one_for_one, 1, 10},
  [
    {m8ball,
     {m8ball_server, start_link, []},
     permanent,
     5000,
     worker,
     [m8ball_server]
    }
  ]
}

*Child (m8ball_server) Initialization*
start_link() ->
    gen_server:start_link({global, ?MODULE}, ?MODULE, [], []).


Consider the following scenario: an Erlang cluster is composed of two nodes,
A and B, with application m8ball running on A.
Failover works perfectly: I can kill node A and see the application
running on the next node, B.
However, when I bring node A back up (it has a higher priority
than B) and start the app, I get the following error. I'm assuming
that this occurs because node B already has a supervisor globally
registered with that name.
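
For context, the priority between the two nodes is set through the kernel
`distributed` parameter. The fragment below is only a sketch of what node A's
config file might look like, assuming the layout used in the linked tutorial.
The node names, the 5000 ms failover timeout and the sync settings are
placeholders, not the exact values in use.

```erlang
%% a.config (hypothetical file name); start the node with: erl -config a
[{kernel,
  [%% Nodes are listed in priority order: A is preferred over B.
   %% 5000 is the time in ms to wait for a dead node before failing over.
   {distributed, [{m8ball, 5000, ['a@REDACTED', 'b@REDACTED']}]},
   %% Assumed sync settings: wait up to 30 s for B at boot.
   {sync_nodes_mandatory, ['b@REDACTED']},
   {sync_nodes_timeout, 30000}]}].
```
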
*Log on Node A*
{error,{{already_started,<2832.61.0>},
        {m8ball,start,[{takeover,'b@REDACTED'},[]]}}}

=INFO REPORT==== 1-Dec-2013::16:17:32 ===
    application: m8ball
    exited: {{already_started,<2832.61.0>},
             {m8ball,start,[{takeover,'b@REDACTED'},[]]}}


*Log on Node B*
=INFO REPORT==== 1-Dec-2013::16:24:55 ===
    application: m8ball
    exited: stopped
    type: temporary

When I tried registering the supervisor locally instead, I got a similar
error, this time because the worker process failed to initialize. However,
if I also registered the worker locally, I would not be able to call it from
any node using the app name (since it would not be globally registered).

*Log on Node A (Supervisor Registered Locally)*
{error,
    {{shutdown,
         {failed_to_start_child,m8ball,
             {already_started,<2832.67.0>}}},
     {m8ball,start,[{takeover,'b@REDACTED'},[]]}}}


Any pointers?

-- 
Best Regards,
Tyron Zerafa