restarting child processes

Lennart Ohman <>
Wed Oct 19 23:27:15 CEST 2005


Hi Chris,
the child process (or worker) crashes in its init phase. You
can see this by examining the Context field in the supervisor
report (start_error). The supervisor then considers itself
to have failed and crashes in its own init phase.
As a note, your child is actually not a proper OTP process either,
or, to put it in a more polite way :-), one can say that you have
chosen to roll your own by not using, for instance, the gen_server
behaviour.
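
If you do want to keep a hand-rolled process, here is a minimal sketch of
the child module, assuming the only change needed is to the start function.
The supervisor calls the start function in its own process and expects it
to return {ok, Pid} for a newly spawned, linked process. In the original
start_child/0 the receive loop runs inside that call, so the exit(blah)
shows up as a start_error. The variable name _Other is my own choice; the
rest follows the posted code.

```erlang
% swf_kid.erl (sketch: plain process that follows the supervisor start contract)
-module(swf_kid).
-export([start_child/0, child_work/0, stop_child/1]).

%% Called by the supervisor. It must return promptly with {ok, Pid},
%% where Pid is a new process linked to the caller (the supervisor).
start_child() ->
    Pid = spawn_link(fun child_work/0),
    {ok, Pid}.

%% The worker loop now runs in the spawned process, not in the supervisor.
child_work() ->
    receive
        stop ->
            io:format("stopping!~n"),
            ok;
        _Other ->
            child_work()
    after 750 ->
            io:format("child exiting~n"),
            exit(blah)
    end.

stop_child(C) ->
    C ! stop.
```

Note that with restart type permanent the supervisor restarts the child
after a normal exit as well, so sending stop will also lead to a restart.
A process started with plain spawn_link is still not a full OTP process
(proc_lib:spawn_link/1 would be one step closer), which is why the
gen_server route is usually preferable.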

The convention is that you write code for a gen_server by having
a start_link function which calls gen_server:start_link.
Then you must have an init function in the call-back module
mentioned in the gen_server:start_link call. The init function
must return one of a set of allowed return values. When, for
instance, {ok,InitialLoopDataStructure} is returned, the gen_server
process enters its working state and will be restarted (if so
configured in the supervisor) in case it "misbehaves".
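
As a rough sketch of that convention applied to your example, the child
could look something like the module below. The module name
swf_kid_server, the stop/1 helper, the IDLE_TIMEOUT macro and the atom
no_state are assumptions of mine, not anything from your code. The 750 ms
inactivity rule is expressed with the gen_server timeout value, which
delivers the atom timeout to handle_info/2 when no message arrives in time.

```erlang
% swf_kid_server.erl (hypothetical name, sketch only)
-module(swf_kid_server).
-behaviour(gen_server).

-export([start_link/0, stop/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-define(IDLE_TIMEOUT, 750).  % milliseconds without any message

%% Called by the supervisor; returns {ok, Pid} once init/1 has returned.
start_link() ->
    gen_server:start_link(?MODULE, [], []).

stop(Pid) ->
    gen_server:cast(Pid, stop).

%% Returning a timeout as the third element arms the inactivity timer.
init([]) ->
    {ok, no_state, ?IDLE_TIMEOUT}.

handle_call(_Request, _From, State) ->
    {reply, ok, State, ?IDLE_TIMEOUT}.

handle_cast(stop, State) ->
    io:format("stopping!~n"),
    {stop, normal, State};
handle_cast(_Msg, State) ->
    {noreply, State, ?IDLE_TIMEOUT}.

%% No message within IDLE_TIMEOUT: terminate abnormally so that the
%% supervisor restarts the process.
handle_info(timeout, State) ->
    io:format("child exiting~n"),
    {stop, blah, State};
handle_info(_Other, State) ->
    {noreply, State, ?IDLE_TIMEOUT}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

The child specification in the supervisor would then name this module's
start function, for example
{kid, {swf_kid_server, start_link, []}, permanent, brutal_kill, worker, [swf_kid_server]}.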

Best Regards,
Lennart

-------------------------------------------------------------
Lennart Ohman                   office  : +46-8-587 623 27
Sjoland & Thyselius Telecom AB  cellular: +46-70-552 67 35
Sehlstedtsgatan 6               fax     : +46-8-667 82 30
SE-115 28, STOCKHOLM, SWEDEN    email   : 


-----Original Message-----
From: 
[mailto:] On Behalf Of Chris Campbell
Sent: Wednesday, October 19, 2005 11:00 PM
To: 
Subject: restarting child processes

Hi,

I'm playing with the supervisor behaviour, but for some reason it
always terminates after a problem with the child.  The program is a
little contrived, to learn about the behaviour.  The child is to
terminate if it doesn't receive a message within 750ms.  The
supervisor should restart it indefinitely; instead, it terminates.


Here is the child module.

% swf_kid.erl
-module(swf_kid).
-export([start_child/0, child_work/0, stop_child/1]).


start_child() ->
    child_work().

child_work() ->
    receive
        stop ->
            io:format("stopping!~n"),
            ok;
        Others ->
            child_work()
    after 750 ->
            io:format("child exiting~n"),
            exit(blah)
    end.

stop_child(C) ->
    C ! stop.

and here is the supervisor.

% swf_supervisor.erl
-module(swf_supervisor).
-behaviour(supervisor).
-export([start_link/0, init/1]).


start_link() ->
    supervisor:start_link(swf_supervisor, []).

init(_X) ->
    {ok, {{one_for_one, 50, 1},
          [{kid, {swf_kid, start_child, []},
            permanent, brutal_kill, worker, []}]}}.

This gives the following error (with sasl)...

> swf_supervisor:start_link().
child exiting

=SUPERVISOR REPORT==== 19-Oct-2005::21:52:53 ===
     Supervisor: {<0.256.0>,swf_supervisor}
     Context:    start_error
     Reason:     {'EXIT',blah}
     Offender:   [{pid,undefined},
                  {name,kid},
                  {mfa,{swf_kid,start_child,[]}},
                  {restart_type,permanent},
                  {shutdown,brutal_kill},
                  {child_type,worker}]

** exited: shutdown **

Why isn't the child being restarted?


Regards,
Chris




