[erlang-questions] Idea on stateful server load sharing & fail over

Kaiduan Xie <>
Wed Apr 7 21:09:56 CEST 2010

Thank you Evans for the reply.

We are doing process per call instead of process per user because not
all users make/receive calls at the same time.

We can save all call state information to a shared/distributed database,
like Mnesia. There are two issues to be addressed. For example, if server
A crashes or is removed from the cluster, which server in the group will
re-construct the gen_fsm/gen_server processes from the crashed server A?
Saving all call states from all servers to a distributed database is not
optimal.

Another problem, as I stated, is how to dispatch requests to the back end
servers. A basic hash is not enough because back end servers can come up
or go away.
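
To make the dispatch problem concrete, here is a minimal sketch, not a
finished design. naive_pick/2 is the basic hash: it maps the call id onto
the current server list, so the mapping depends on the list length.
hrw_pick/2 is rendezvous (highest-random-weight) hashing, one possible
alternative that was not proposed in this thread. The module name, the
function names and the atoms used as server names are all made up for
illustration.

```erlang
-module(dispatch_sketch).
-export([naive_pick/2, hrw_pick/2, moved/4]).

%% Basic hash: map the call id onto the current server list.
%% The result depends on length(Servers), so adding or removing a
%% server can send an existing call id to a different server.
naive_pick(CallId, Servers) when Servers =/= [] ->
    Index = erlang:phash2(CallId, length(Servers)) + 1,
    lists:nth(Index, Servers).

%% Rendezvous (highest-random-weight) hashing, shown as one possible
%% alternative: score every {CallId, Server} pair and pick the server
%% with the highest score. Removing a server only moves the call ids
%% that were mapped to that server.
hrw_pick(CallId, Servers) when Servers =/= [] ->
    Scored = [{erlang:phash2({CallId, S}), S} || S <- Servers],
    {_Score, Best} = lists:max(Scored),
    Best.

%% Count how many call ids change server when the server list
%% changes from Old to New, for a given pick function.
moved(Pick, Old, New, CallIds) ->
    length([Id || Id <- CallIds, Pick(Id, Old) =/= Pick(Id, New)]).
```

For example, with Ids being a list of call id binaries,
dispatch_sketch:moved(fun dispatch_sketch:hrw_pick/2, [a,b,c], [a,b], Ids)
counts only the ids that were on c, while the same call with
fun dispatch_sketch:naive_pick/2 will usually also count ids that were
on a or b. A stateless dispatcher only needs the current server list to
evaluate either function. Neither function restores the state of the
calls that do move; that still has to come from a checkpoint or a peer.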


On Wed, Apr 7, 2010 at 2:42 PM, Evans, Matthew <> wrote:
> Hi,
> Is there any reason why you couldn't have a "back end server" process per user? That would be the Erlang way of doing things. Even if there are millions of users/processes you should be fine.
> The "back end server" process would save state in its state record (assuming it is a gen_server or gen_fsm), and when it is in a stable state it could checkpoint that information to ets, mnesia or a peer process.
> You could have an ETS table that maps the call id string to the back end server pid. And use erlang:monitor to determine if the back end server dies. If it does, restart it or map the record in the ets table elsewhere.
> Matt
> -----Original Message-----
> From:  [mailto:] On Behalf Of Kaiduan Xie
> Sent: Wednesday, April 07, 2010 1:39 PM
> To: erlang-questions
> Subject: [erlang-questions] Idea on stateful server load sharing & fail over
> Hi, all,
> Consider the following case: a system consists of a farm of stateful
> call servers and needs to support a massive number of users with zero
> downtime, or at least five-nines availability. The number of servers is
> dynamic; in other words, servers can come up and go away. Each server
> is stateful: it stores information about the call, and that state
> should survive a server crash so that in-call features can still be
> supported after the crash. The system has a two-tier architecture. At
> the front, a stateless dispatcher dispatches incoming requests to the
> back end stateful servers. So the following questions come up:
> 1. How to dispatch the incoming requests to the back end servers?
> Please note that the back end server is stateful; for example, all
> requests in a call should be dispatched to the same server. The call
> is identified by a call id that is a random string. Also, back end
> servers can be added or removed dynamically.
> 2. How to replicate the call state information among back end servers?
> 3. How does the front end server detect that a back end server is down?
> How do back end servers detect that one of their peers is down? Can we
> use Erlang's monitor in distribution to achieve that?
> Thanks all for the pointers and thoughts,
> Kaiduan
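
Matt's suggestion in the quoted reply (an ETS table that maps the call id
to the back end server pid, plus erlang:monitor to notice when that
process dies) could be sketched roughly as below. This is only an
illustration under assumptions: the module name call_registry, the table
name and the API functions are invented here, the state is a map (so a
reasonably recent OTP release is assumed), and on 'DOWN' the sketch only
drops the mapping. The "restart it or map the record elsewhere" policy is
left out.

```erlang
-module(call_registry).
-behaviour(gen_server).
-export([start_link/0, register_call/2, lookup/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(TAB, call_registry_tab).   % hypothetical table name

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Map CallId to the back end process Pid and start monitoring Pid.
register_call(CallId, Pid) when is_pid(Pid) ->
    gen_server:call(?MODULE, {register, CallId, Pid}).

%% Lookups read the ETS table directly, without a message round trip.
lookup(CallId) ->
    case ets:lookup(?TAB, CallId) of
        [{CallId, Pid, _Ref}] -> {ok, Pid};
        [] -> not_found
    end.

init([]) ->
    ?TAB = ets:new(?TAB, [named_table, protected, set]),
    {ok, #{}}.   % state: monitor reference => call id

handle_call({register, CallId, Pid}, _From, Refs0) ->
    %% Drop the monitor of a previous owner of this call id, if any.
    Refs1 = case ets:lookup(?TAB, CallId) of
                [{_, _, OldRef}] ->
                    erlang:demonitor(OldRef, [flush]),
                    maps:remove(OldRef, Refs0);
                [] ->
                    Refs0
            end,
    Ref = erlang:monitor(process, Pid),
    true = ets:insert(?TAB, {CallId, Pid, Ref}),
    {reply, ok, Refs1#{Ref => CallId}}.

handle_cast(_Msg, Refs) ->
    {noreply, Refs}.

%% The monitored back end process died (or its node became unreachable).
%% A real system would restart it or remap the call id at this point.
handle_info({'DOWN', Ref, process, _Pid, _Reason}, Refs) ->
    case maps:take(Ref, Refs) of
        {CallId, Rest} ->
            ets:delete(?TAB, CallId),
            {noreply, Rest};
        error ->
            {noreply, Refs}
    end;
handle_info(_Other, Refs) ->
    {noreply, Refs}.
```

erlang:monitor/2 also accepts pids that live on other connected nodes; if
the connection to that node is lost, the 'DOWN' message arrives with the
reason noconnection, so the same handler covers both a process crash and
a lost node.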
