Standby system handover
Serge Aleynikov
serge@REDACTED
Fri Jul 7 13:50:19 CEST 2006
Chaitanya,
Wouldn't your packet forwarder then be a single point of failure? You
could avoid that by adding balancing capability to the client, with the
intelligence to remove unavailable packet forwarders from its list of
known servers, though this increases the complexity of the client's
implementation.
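A minimal sketch of that client-side approach, assuming the client holds a
list of {Host, Port} forwarders. The module name failover_client, the
function connect_any and the 2000 ms default timeout are illustrative
assumptions, not part of any existing code:

```erlang
-module(failover_client).
-export([connect_any/1, connect_any/2]).

%% Hypothetical helper: try each {Host, Port} in order and return the first
%% socket that connects, together with the server list reordered so that
%% entries which failed are moved to the end.
connect_any(Servers) ->
    connect_any(Servers, 2000).

connect_any(Servers, Timeout) ->
    connect_any(Servers, [], Timeout).

connect_any([], _Failed, _Timeout) ->
    {error, no_server_available};
connect_any([{Host, Port} = S | Rest], Failed, Timeout) ->
    case gen_tcp:connect(Host, Port, [binary, {active, false}], Timeout) of
        {ok, Sock} ->
            %% Working server first, untried ones next, failed ones last.
            {ok, Sock, [S | Rest] ++ lists:reverse(Failed)};
        {error, _Reason} ->
            connect_any(Rest, [S | Failed], Timeout)
    end.
```

The caller keeps the returned list and passes it to the next connect_any
call, so a forwarder that was down is only retried after the others.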
The way we deal with a similar problem domain is as follows. Two hosts
and two routers are selected to form a redundant cluster. Each host has
two NICs and a serial crossover link to its mate server. The Ethernet
bonding kernel module is configured to form a bonding interface with a
*single* MAC address assigned to both NICs in an active/standby link
configuration. Each NIC is connected to a different router (Cisco 3750
or similar) running HSRP. The server-side connections are placed in the
same VLAN. This gives Layer 2 redundancy and resilience to router
failures or NIC failures:
     |     L A N      |
     |                |
+----+----+ HSRP +----+----+
|  CISCO  +------+  CISCO  |
|         +------+         |
+----+----+      +----+----+
     |   \        /   |
     |     \    /     |
     |       \/       |
     | VIP1  /\  VIP2 |
     |     /    \     |
   +-+-----+    +-----+-+
   |Server1|    |Server2|
   +----+--+    +---+---+
        |           |
        +-----------+
    Serial Heartbeat Link
Secondly, we use the http://linux-ha.org project for virtual IP
management. Two VIPs are configured, one owned by each active server.
Clients talk to the servers through these VIPs. Additionally, a serial
heartbeat link is used as a separate hardware path between the VIP
management software instances for health checks. In the event that a
server goes down or a service goes down for maintenance, the VIP owned
by that server gets migrated to the other server until the former owner
becomes available again. This gives us Layer 3 redundancy.
As far as the server application is concerned, do you need the servers
running in active-standby mode or load sharing? If the servers need to
share some state, you could store that state in Mnesia and have it
replicated between both nodes, while both nodes remain available and
share the workload.
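A minimal sketch of such a replicated table, assuming Mnesia is already
started on every node and a RAM-only table is enough. The module name
shared_state, its record and the functions create/store/lookup are
hypothetical names chosen for illustration:

```erlang
-module(shared_state).
-export([create/1, store/2, lookup/1]).

%% Hypothetical record; the table is named after it so mnesia:write/1 works.
-record(shared_state, {key, value}).

%% Run once, from one node, with Mnesia already running on every node in
%% Nodes (for example [node() | nodes()]).
create(Nodes) ->
    %% Connect this node's Mnesia to the others so they form one database.
    {ok, _} = mnesia:change_config(extra_db_nodes, Nodes -- [node()]),
    %% ram_copies on all nodes: every write is replicated to each of them.
    {atomic, ok} =
        mnesia:create_table(shared_state,
                            [{ram_copies, Nodes},
                             {attributes, record_info(fields, shared_state)}]),
    ok.

store(Key, Value) ->
    F = fun() -> mnesia:write(#shared_state{key = Key, value = Value}) end,
    {atomic, ok} = mnesia:transaction(F),
    ok.

lookup(Key) ->
    F = fun() -> mnesia:read(shared_state, Key) end,
    case mnesia:transaction(F) of
        {atomic, [#shared_state{value = Value}]} -> {ok, Value};
        {atomic, []} -> not_found
    end.
```

Either node can then call shared_state:store/2 and shared_state:lookup/1.
If the state must survive a restart of both nodes, disc_copies and a
schema created with mnesia:create_schema/1 would be needed instead.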
In this configuration there may not even be a need for a separate packet
forwarder, as each TCP server would simply listen for incoming packets
on the "0.0.0.0" address (which covers all VIPs currently managed by the
server) and do its job, or act as a protocol converter between some TCP
protocol and Erlang terms that are forwarded to another Erlang
gen_server process running in the cluster, chosen by some balancing
method such as the pg2 application.
This approach works well for us for making highly available server
processes.
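A rough sketch of such a listener, under several assumptions. pg2 was
removed in OTP 24, so the sketch uses its successor module pg in place of
the pg2 named above. The module name vip_listener, the group name passed
in, and the wire format (4-byte length-prefixed frames carrying Erlang
external term format) are all hypothetical:

```erlang
-module(vip_listener).
-export([start/2]).

%% Port and Group (a process-group name) are supplied by the caller.
start(Port, Group) ->
    ok = ensure_pg(),
    %% {ip, {0,0,0,0}} binds every local IPv4 address, so VIPs that migrate
    %% to this host later are covered without rebinding.
    {ok, LSock} = gen_tcp:listen(Port, [binary, {packet, 4}, {active, false},
                                        {reuseaddr, true}, {ip, {0,0,0,0}}]),
    Pid = spawn(fun() -> accept_loop(LSock, Group) end),
    ok = gen_tcp:controlling_process(LSock, Pid),
    {ok, Pid, LSock}.

%% Start the default pg scope; it is linked to the calling process.
ensure_pg() ->
    case pg:start_link() of
        {ok, _} -> ok;
        {error, {already_started, _}} -> ok
    end.

accept_loop(LSock, Group) ->
    case gen_tcp:accept(LSock) of
        {ok, Sock} ->
            Pid = spawn(fun() -> serve(Sock, Group) end),
            _ = gen_tcp:controlling_process(Sock, Pid),
            accept_loop(LSock, Group);
        {error, _} ->
            ok
    end.

%% Protocol conversion: one frame in, one Erlang term out to a worker.
serve(Sock, Group) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, Bin} ->
            %% [safe] refuses frames that would create new atoms; a bad
            %% frame crashes this process and closes the connection.
            Reply = dispatch(Group, binary_to_term(Bin, [safe])),
            ok = gen_tcp:send(Sock, term_to_binary(Reply)),
            serve(Sock, Group);
        {error, _} ->
            gen_tcp:close(Sock)
    end.

%% Pick a random member of the group, on any node, and call it.
dispatch(Group, Request) ->
    case pg:get_members(Group) of
        [] ->
            {error, no_workers};
        Members ->
            Worker = lists:nth(rand:uniform(length(Members)), Members),
            try gen_server:call(Worker, Request)
            catch exit:_ -> {error, worker_down}
            end
    end.
```

Worker gen_servers would register themselves with pg:join(Group, self())
from their init/1 callback. Random selection is only one possible
balancing method; pg:get_local_members/1 could be used to prefer workers
on the local node.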
Regards,
Serge
Chaitanya Chalasani wrote:
> Hi,
>
> We currently have a non-realtime standby server for our mission-critical
> application. In our struggle to make a realtime standby system, my job is to
> develop an application that does TCP/IP packet forwarding to the active
> server and, in the event that the active server is unavailable, sends the
> packet to one of the configured standby servers. The application is not
> written in Erlang, but I am planning to write the soft handover application
> in Erlang. I am attaching a test module which serves the purpose,
> but I wanted to know of a better design for it, or of any feature in
> Erlang/OTP that can help me build a more robust application.
>
>
>
> ------------------------------------------------------------------------
>
> -module(routerApp).
> -compile(export_all).
>
>
> listener(PortOwn,ClusterList) ->
>     case gen_tcp:listen(PortOwn,[binary,{packet,0},{keepalive,true},{reuseaddr,true}]) of
>         {ok,LSock} ->
>             acceptConnections(LSock,ClusterList),
>             gen_tcp:close(LSock),
>             listener(PortOwn,ClusterList);
>         Other ->
>             io:format("Received ~p~n",[Other])
>     end.
>
> acceptConnections(LSock,ClusterList) ->
>     case gen_tcp:accept(LSock) of
>         {ok,Sock} ->
>             Pid = spawn(routerApp,clientConnectionThread,[Sock,ClusterList]),
>             gen_tcp:controlling_process(Sock,Pid),
>             acceptConnections(LSock,ClusterList);
>         {error,Reason} ->
>             io:format("Unknown error ~p~n",[Reason]);
>         Other ->
>             io:format("Unknown response ~p~n",[Other])
>     end.
>
> clientConnectionThread(Sock,[{Ipaddress,Port}|ClusterList]) ->
>     case gen_tcp:connect(Ipaddress,Port,[binary,{active,true}]) of
>         {ok,ClustSock} ->
>             case listenForMessages(Sock,ClustSock) of
>                 {error,clusterNodeClosed} ->
>                     clientConnectionThread(Sock,ClusterList++[{Ipaddress,Port}]);
>                 {error,clientClosed} ->
>                     io:format("Client Closed and so parallel thread dying~n");
>                 Other ->
>                     io:format("Unknown error ~p and so parallel thread dying~n",[Other])
>             end;
>         Other1 ->
>             io:format("Received ~p while connecting to ~p ~n",[Other1,{Ipaddress,Port}]),
>             clientConnectionThread(Sock,ClusterList++[{Ipaddress,Port}])
>     end.
>
> listenForMessages(Sock,ClustSock) ->
>     receive
>         {tcp_closed,ClustSock} ->
>             gen_tcp:close(ClustSock),
>             gen_tcp:close(Sock),
>             {error,clientCloses};
>             %{error,clusterNodeClosed};
>         {tcp_closed,Sock} ->
>             gen_tcp:close(Sock),
>             gen_tcp:close(ClustSock),
>             {error,clientClosed};
>         {tcp,ClustSock,Data} ->
>             io:format("Received ~w from server socket~n",[Data]),
>             gen_tcp:send(Sock,Data),
>             listenForMessages(Sock,ClustSock);
>         {tcp,Sock,Data} ->
>             io:format("Received ~w from client socket~n",[Data]),
>             gen_tcp:send(ClustSock,Data),
>             listenForMessages(Sock,ClustSock);
>         Other ->
>             io:format("Received unknown info ~p~n",[Other]),
>             {error,unknown}
>     end.