How to do a takeover?
Ulf Wiger (AL/EAB)
ulf.wiger@REDACTED
Wed Mar 16 09:36:19 CET 2005
One way to handle takeover is to use globally
registered names, as you suggest.
A reasonably simple way to coordinate transfer
during takeover is to use the following sequence:
(Context: application instance {A, N1} is supposed to
take over from {A, N2}, where N1 and N2 are node names.)
1) From e.g. a start_phase function, issue a gen_server
   call to processes in {A, N1} that should transfer
   global names and state.
2) In the handle_call for each process P,
   a) first call global:re_register_name() to move
      the name from {A,N2} to {A,N1}. This should
      prevent new requests from coming in to {P,N2}.
   b) Issue a gen_server:call({P,N2}, takeover_state),
      which signals the process locally registered as P
      on node N2 to hand over the state. This message
      should be processed _after_ all external requests
      in {P,N2}, due to the FIFO semantics of gen_server.
   c) {P,N1} now has the global name and the state, and
      can return control to the gen_server.
   d) {P,N2} could perhaps relay all further messages
      to {P,N1} until it is terminated, but it shouldn't
      be strictly necessary.
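
The sequence above could be sketched roughly as follows. This is only a
minimal illustration, not production code: the module name takeover_srv,
the global name my_service, and the takeover/1 helper are made up for the
example, and it assumes the old instance on N2 already holds the global
name and is locally registered under the same name as the new one.

```erlang
-module(takeover_srv).
-behaviour(gen_server).

-export([start_link/1, takeover/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical global name that clients use to reach the server.
-define(GLOBAL_NAME, my_service).

%% Each instance is locally registered as ?MODULE on its own node.
start_link(InitState) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, InitState, []).

%% Step 1: run on the new node N1, e.g. from a start_phase function.
takeover(OldNode) ->
    gen_server:call(?MODULE, {takeover_from, OldNode}).

init(State) ->
    {ok, State}.

handle_call({takeover_from, OldNode}, _From, _State) ->
    %% Step 2a: move the global name to this process, so that new
    %% requests sent via the global name arrive here.
    yes = global:re_register_name(?GLOBAL_NAME, self()),
    %% Step 2b: ask the locally registered process on the old node
    %% for its state. The request is queued behind whatever is
    %% already in that process's mailbox.
    OldState = gen_server:call({?MODULE, OldNode}, takeover_state),
    %% Step 2c: continue with the name and the transferred state.
    {reply, ok, OldState};
handle_call(takeover_state, _From, State) ->
    %% Old instance: hand over the current state.
    {reply, State, State};
handle_call(_Other, _From, State) ->
    {reply, {error, unknown_request}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

During a takeover, the application callback receives {takeover, Node} as
its start type, so a start_phase(Phase, {takeover, OldNode}, Args) clause
would be a natural place to call takeover_srv:takeover(OldNode). Step 2d
(relaying late messages from the old process) is left out of the sketch.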
One thing to consider is that each "top application" is
moved as one entity, and as soon as takeover is finished,
the old instance is terminated. This can cause problems
in applications that assume that they will always be on
the same node (e.g. applications using SNMP).
One way to address this is to introduce a wrapper
application that includes all such applications. This
has been done for many years in e.g. AXD 301. In AXD 301,
this didn't solve all problems, so we made a special
"distributed application controller" that is able to
coordinate takeover of several applications in parallel.
For the longest time, though, making one large application
of included O&M applications worked just fine.
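
As an illustration of the wrapper idea, an application resource file
along these lines would make several applications move as one unit.
All names here (om_wrapper, om_wrapper_cb, snmp_adapter, alarm_handler_app)
are invented for the example; this is not the AXD 301 configuration.

```erlang
%% om_wrapper.app (hypothetical wrapper application)
{application, om_wrapper,
 [{description, "Wrapper for O&M applications that must stay together"},
  {vsn, "1"},
  {modules, [om_wrapper_cb, om_wrapper_sup]},
  {registered, []},
  %% The included applications are started as part of the wrapper,
  %% so they fail over and are taken over together with it.
  {included_applications, [snmp_adapter, alarm_handler_app]},
  %% start_phase/3 is called for each phase, which gives a hook
  %% for the name and state transfer during takeover.
  {start_phases, [{init, []}, {takeover, []}]},
  {applications, [kernel, stdlib]},
  %% application_starter runs the start phases for the wrapper
  %% and for its included applications.
  {mod, {application_starter, [om_wrapper_cb, []]}}
 ]}.
```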
/Uffe
> -----Original Message-----
> From: owner-erlang-questions@REDACTED
> [mailto:owner-erlang-questions@REDACTED]On Behalf Of Anders Nygren
> Sent: den 16 mars 2005 01:16
> To: erlang-questions@REDACTED
> Subject: How to do a takeover?
>
>
> Hi
> I have started looking at how to do a takeover.
> In my tests I discovered that I need to call global:re_register_name
> to "move" the name of my server to the new node, and that I need a way
> to move some state information from the old node to the new one.
>
> The way my current design works is that I have a number of gen_servers.
> When one server needs to call another server, it spawns a worker that
> makes the call, so as not to block. This gives me the problem that I
> have a lot of workers that must be terminated correctly before the
> takeover can be finalized.
>
> But I don't understand how to make a controlled transfer of messages
> in the mailbox, or of linked worker processes.
>
> My best guess now is that I have to stop using gen_server:call to
> other servers and instead use an interface between my servers that
> always sends messages to processes with globally registered names.
>
> /Anders Nygren
>