[erlang-questions] Idea on stateful server load sharing & fail over

Wed Apr 7 21:09:56 CEST 2010

Thank you Evans for the reply.

We use a process per call instead of a process per user because not
all users make/receive calls at the same time.

We can save all call state information to a shared/distributed database
like Mnesia. There are two issues to be addressed. First, if server
A crashes or is removed from the cluster, which server in the group will
reconstruct the gen_fsm/gen_server processes from the crashed server A?
Second, saving the call states of all servers to a distributed database
is not optimal either.

Another problem, as I stated, is how to dispatch requests to the back end
servers. A basic hash is not enough because back end servers can come up
or go away.
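
To illustrate the kind of dispatch that tolerates servers joining and
leaving, here is a minimal sketch of rendezvous (highest-random-weight)
hashing. It is only one possible approach, not something settled in this
thread. The module name dispatch_sketch and the function pick_server/2
are hypothetical, and the list of live servers is assumed to be supplied
by the caller (for example from nodes() or a monitor-maintained list).

```erlang
-module(dispatch_sketch).
-export([pick_server/2]).

%% Rendezvous hashing sketch (hypothetical helper, not an OTP API).
%% Every live server is scored against the call id and the server with
%% the highest score wins. The result does not depend on list order.
-spec pick_server(term(), [term()]) -> {ok, term()} | {error, no_servers}.
pick_server(_CallId, []) ->
    {error, no_servers};
pick_server(CallId, Servers) ->
    %% erlang:phash2/1 hashes any term to an integer in 0..2^27-1.
    Scored = [{erlang:phash2({CallId, Server}), Server} || Server <- Servers],
    %% lists:max/1 uses term ordering, so ties fall back to the server term.
    {_Score, Best} = lists:max(Scored),
    {ok, Best}.
```

With this scheme, removing a server only remaps the calls that were
assigned to that server, and adding a server only takes over the calls
for which it now scores highest. Calls that do move would still need
their state from wherever it was checkpointed.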

kaiduan

On Wed, Apr 7, 2010 at 2:42 PM, Evans, Matthew <mevans@REDACTED> wrote:
> Hi,
>
> Is there any reason why you couldn't have a "back end server" process per user? That would be the Erlang way of doing things. Even if there are millions of users/processes you should be fine.
>
> The "back end server" process would save state in its state record (assuming it is a gen_server or gen_fsm), and when it is in a stable state it could checkpoint that information to ets, mnesia or a peer process.
>
> You could have an ETS table that maps the call id string to the back end server pid. And use erlang:monitor to determine if the back end server dies. If it does, restart it or map the record in the ets table elsewhere.
>
> Matt
>
> -----Original Message-----
> From: erlang-questions@REDACTED [mailto:erlang-questions@REDACTED] On Behalf Of Kaiduan Xie
> Sent: Wednesday, April 07, 2010 1:39 PM
> To: erlang-questions
> Subject: [erlang-questions] Idea on stateful server load sharing & fail over
>
> Hi, all,
>
> Consider the following case: a system consists of a farm of stateful
> call servers and needs to support a massive number of users with zero
> downtime, or at least five-nines availability. The number of servers is
> dynamic; in other words, servers can come up and go away. Each server is
> stateful: it stores information about the call, and that state
> should survive a server crash so that in-call features can be
> supported after the crash. The system has a two-tier architecture. At the
> front, a stateless dispatcher dispatches incoming requests to the back
> end stateful servers. So the following
> questions arise:
>
> 1. How to dispatch incoming requests to the back end servers? Please
> note that each back end server is stateful, so all
> requests in a call should be dispatched to the same server. The call
> is identified by a call id that is a random string. Also, back
> end servers can be added or removed dynamically.
>
> 2. How to replicate the call state information among back end servers?
>
> 3. How does the front end server detect that a back end server is down?
> How do back end servers detect that one of their peers is down? Can we
> use Erlang's monitor in distribution to achieve that?
>
> Thanks all for the pointers and thoughts,
>
> Kaiduan
>