[erlang-questions] Maintaining state between application failover
Wed Aug 14 13:20:44 CEST 2013
Assuming a distributed application, how could state between application
starts due to failover be maintained?
For illustration purposes, consider the following problem:
We want a kind of server that delivers unique numbers. Starting at 0, on
each request this number is delivered and incremented.
For implementation, we use a gen_server process that keeps the current
number in its state. We put that process under a one-for-one supervisor,
which serves as the top supervisor of the application.
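For concreteness, a minimal sketch of such a server might look like the following. The module name counter_server and the function next/0 are made up for this illustration, and the supervisor is left out; it would simply list this module as its only child under the one_for_one strategy.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Public API (names are hypothetical, chosen for this example).
-export([start_link/0, next/0]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Returns the current number and increments the stored one.
next() ->
    gen_server:call(?MODULE, next).

init([]) ->
    %% The counter lives only in the process state, so every
    %% (re)start of the process begins at 0 again.
    {ok, 0}.

handle_call(next, _From, N) ->
    {reply, N, N + 1}.

handle_cast(_Msg, N) ->
    {noreply, N}.

handle_info(_Info, N) ->
    {noreply, N}.
```

Since init/1 always returns 0, a restart by the supervisor loses whatever number had been reached.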
Now, even in a non-distributed setup, the gen_server could not maintain the
state between restarts managed by its supervisor. We could store the
current number in the environment of the application itself (which doesn't
feel right, but for illustration purposes let's keep it in mind), where it
would survive restarts of the gen_server process.
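A rough sketch of that variant, assuming a hypothetical application name unique_numbers and a hypothetical environment key next_number (neither comes from a real project), could look like this. It relies only on application:get_env/3 and application:set_env/3.

```erlang
-module(env_counter).
-behaviour(gen_server).

-export([start_link/0, next/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical application name and environment key.
-define(APP, unique_numbers).
-define(KEY, next_number).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

next() ->
    gen_server:call(?MODULE, next).

init([]) ->
    %% Resume from the stored value, or start at 0 if none is set.
    {ok, application:get_env(?APP, ?KEY, 0)}.

handle_call(next, _From, N) ->
    %% Store the next number before replying, so that a crash after
    %% this point cannot hand out N a second time.
    ok = application:set_env(?APP, ?KEY, N + 1),
    {reply, N, N + 1}.

handle_cast(_Msg, N) ->
    {noreply, N}.

handle_info(_Info, N) ->
    {noreply, N}.
```

The environment is local to the node, so this only helps as long as the node itself stays up.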
In a distributed setup, even the state stored in the application
environment would not survive an application failover. When the node on
which the application is running dies and the application is restarted on
another node, the counter starts at 0 again.
Performance considerations aside, the current number could be written to a
state file on every update and read back at startup. But since the nodes
would usually be running on different machines, on failover the
application would be restarted on a machine other than the one where the
file resides. And since the reason for the death of the Erlang node is
presumably the death of the hardware node, the file would not be
accessible from the application started in failover mode. To keep the
state file accessible everywhere, we would need to put it on an NFS mount
or something similar, but then the NFS server would become a critical
component in our setup, not to mention the overkill of running an extra
machine for the single purpose of sharing a single file which would not
exceed a few bytes in size. Using a database of whatever flavor is
essentially the same.
So, how could state be efficiently maintained in the Erlang way of doing
things?
To clarify, I am *not* asking for a solution to the problem of generating
unique numbers; there are probably a thousand better ways to do this,
UUIDs and whatnot. I am asking for ways to maintain state between
restarts of a distributed application in a failover scenario. The example
problem above is a special case of the general problem I am asking about,
made up purely for the purpose of having a simple illustration. A solution
to the general problem would automatically solve the special case, anyway ;)