[erlang-questions] question how to recover a 'stateful' app when Erlang node crashes?
Jesper Louis Andersen
jesper.louis.andersen@REDACTED
Tue Jan 17 22:19:38 CET 2012
On 1/16/12 10:52 PM, Roman Shestakov wrote:
>
> What is the correct way to recover a "stateful" Erlang application? In
> my case, the app that is crashing is a complex hierarchy of
> fsm processes, each containing certain state. I understand how to
> recover stateless processes with supervisors, but what is the correct
> way to recover stateful apps? Clearly in my case I probably need some
> kind of supervisor 'node', but what would be the steps to correctly
> recover killed processes with their states? Do I need to use a db and
> replay the processes from disk on another node, or can I have a node
> with an identical process hierarchy?
>
The problem with a crashing process is that its internal state can no
longer be trusted: there was a reason it went wrong. The problem with a
crashing node is largely the same: there is a reason you ended up with
resource exhaustion in the first place.

The trick is that there is no trick. Either another node must hold a
copy of your state, or you must write your state to stable storage once
in a while so you can restart from it. The point is that you can then be
sure that restarting from this stable state causes no trouble.
Essentially, you want to store to disk only when you are sure that the
relevant part of the system is consistent with your invariants. Or move
your state to another node.
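
As a rough illustration of the "store to disk only when the invariants
hold" idea, here is a minimal sketch. The module name `checkpoint`, the
functions `save/3` and `restore/2`, and the caller-supplied `Valid` fun
are all hypothetical names made up for this example; only the `file`
module calls and `term_to_binary/1` / `binary_to_term/1` come from
Erlang/OTP itself. It assumes `Path` is a plain string and that the whole
state fits in one term.

```erlang
-module(checkpoint).
-export([save/3, restore/2]).

%% Persist State to Path, but only if the caller-supplied invariant
%% check Valid(State) returns true. The state is written to a temporary
%% file first and then renamed over Path, so a crash in the middle of
%% the write leaves the previous checkpoint untouched.
%% (Hypothetical helper; Path is assumed to be a string.)
save(Path, State, Valid) when is_function(Valid, 1) ->
    case Valid(State) of
        true ->
            Tmp = Path ++ ".tmp",
            ok = file:write_file(Tmp, term_to_binary(State)),
            ok = file:rename(Tmp, Path);
        false ->
            %% Never overwrite a good checkpoint with a suspect state.
            {error, invariant_violated}
    end.

%% Load the last checkpoint, or return Default when none exists yet.
restore(Path, Default) ->
    case file:read_file(Path) of
        {ok, Bin} -> binary_to_term(Bin);
        {error, enoent} -> Default
    end.
```

An fsm process could then call `checkpoint:restore/2` from its `init`
callback and `checkpoint:save/3` at points where it knows its state is
consistent, so a restart resumes from the last known-good state rather
than from whatever the process held when it crashed. This sketch does
not force the data to disk and does nothing about a node whose disk is
lost; for that, the state has to live on another node as well.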
--
Jesper Louis Andersen
Erlang Solutions Ltd., Copenhagen, DK