[erlang-questions] About behavior of OTP's supervisor-worker architecture

Mon Sep 13 23:24:18 CEST 2010

Hi Tushar,

The OTP supervisor module does not support your desired behaviour, but you
can use the supervisor and the monitor facility of Erlang to implement it.

A rough design:
The supervisor should have dynamic children. In the function that calls
supervisor:start_child/2, the pid of the started child should be passed on
to a separate gen_server process, which calls erlang:monitor/2 on it.
When the child dies, the monitoring process will get a 'DOWN' message, which
ought to contain enough information to start a new process: you just have
to include the state data of the process in the Reason for termination.
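
A minimal sketch of the monitoring process in the design above. It is not a
tested implementation, and every name in it is an assumption made for
illustration: the module name failover_monitor, the atoms p and q for the two
implementations, and the {failed, WorkerState} shape of the exit reason. It
also assumes that Sup is a simple_one_for_one supervisor whose children are
declared temporary (so the supervisor does not restart P on its own) and whose
start function accepts the two extra arguments [Impl, WorkerState].

```erlang
-module(failover_monitor).
-behaviour(gen_server).

-export([start_link/1, start_worker/2, next_child/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Sup: name or pid of the (assumed) simple_one_for_one supervisor.
start_link(Sup) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, Sup, []).

%% Start implementation Impl (p or q) under Sup and monitor it.
start_worker(Impl, WorkerState) ->
    gen_server:call(?MODULE, {start_worker, Impl, WorkerState}).

%% Pure decision: what to start after Impl exited with Reason.
%% Only a p that stopped with {failed, WorkerState} is replaced by q.
next_child(p, {failed, WorkerState}) -> {start, q, WorkerState};
next_child(_Impl, _Reason) -> ignore.

%% Server state: {Sup, Refs}, where Refs maps monitor reference -> Impl.
init(Sup) ->
    {ok, {Sup, #{}}}.

handle_call({start_worker, Impl, WorkerState}, _From, {Sup, Refs}) ->
    {Reply, Refs1} = start_and_monitor(Sup, Impl, WorkerState, Refs),
    {reply, Reply, {Sup, Refs1}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% A monitored child died: look up which implementation it was and
%% decide from the exit reason whether to start the fallback.
handle_info({'DOWN', Ref, process, _Pid, Reason}, {Sup, Refs}) ->
    case maps:take(Ref, Refs) of
        {Impl, Rest} ->
            case next_child(Impl, Reason) of
                {start, Next, WorkerState} ->
                    {_, Refs1} = start_and_monitor(Sup, Next, WorkerState, Rest),
                    {noreply, {Sup, Refs1}};
                ignore ->
                    {noreply, {Sup, Rest}}
            end;
        error ->
            %% Not one of our monitors; nothing to do.
            {noreply, {Sup, Refs}}
    end;
handle_info(_Other, State) ->
    {noreply, State}.

%% Ask the supervisor for a new child and monitor it on success.
%% Any reply other than {ok, Pid} is handed back to the caller unmonitored.
start_and_monitor(Sup, Impl, WorkerState, Refs) ->
    case supervisor:start_child(Sup, [Impl, WorkerState]) of
        {ok, Pid} ->
            Ref = erlang:monitor(process, Pid),
            {{ok, Pid}, Refs#{Ref => Impl}};
        Error ->
            {Error, Refs}
    end.
```

For this to work, P has to put its state into the exit reason itself, for
example by returning {stop, {failed, State}, State} from a gen_server callback
when it detects a failure. An uncaught exception ends the process with a reason
built from the exception and a stack trace, not the state, so in that case
next_child/2 returns ignore and Q is not started.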

So you have to write some code to make it work, but you can reuse the
supervisor and implement the monitoring process using gen_server. If you are
really keen on this you can even implement your own OTP module with its own
behaviour and all, but I would recommend that you get the thing to work
first in order to avoid too many balls in the air in the beginning.

Cheers,
Torben

On Mon, Sep 13, 2010 at 22:47, Tushar Deshpande <tushar.erlang@REDACTED> wrote:

> Hi,
>
> I've a question about OTP's supervisor-worker architecture.
>
> I understand that OTP allows us to write fault-tolerant apps.
> This is made possible by supervisor-worker architecture.  A
> supervisor manages several workers.  If a worker (or a group
> of workers) fails then supervisor is able to restart it.  The worker
> is restarted and it resumes with the same state that it had
> before crash.
>
> Now, let's consider following situation.
>
> A worker process has two possible implementations, P and Q.
> Worker P runs under normal conditions.  Worker Q is supposed
> to run in case P fails.
>
> If worker P crashes then, supervisor is notified about the crash.
> Typically, the supervisor would restart worker P.
>
> But, I would like the supervisor to behave in a different manner.
> In case the worker P fails, the supervisor should start the worker Q.
> The worker Q should begin its execution with the same state that
> P had at the point of crash.
>
> Is it possible to write an OTP application that does this?  If yes,
> then do I need to customize the supervisor code?
>
>
> Best Regards,
>
> Tushar Deshpande
>

-- 
http://www.linkedin.com/in/torbenhoffmann