tracking failures in supervisor:start_link

Gunilla Arendt gunilla@REDACTED
Thu Feb 2 13:20:35 CET 2006


Samuel Rivas wrote:
 > Hi,
 >
 > If any of the child processes dies in the init phase when a supervisor is
 > created with start_link, the supervisor terminates itself with the
 > reason shutdown. I'd prefer the exit message to propagate untouched, so
 > I need to know the initial exit reason.
 >
 > The only way I came up with is creating the supervisor with an empty
 > child list, starting the children afterwards with the start_child
 > function. That way the supervisor returns {error, Reason} if something
 > fails in the children's init functions. This is not perfect since I
 > cannot know whether the child process died or returned {error, Reason}
 > but is better than the initial case.
 >
 > Any cleaner way to do that?
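
For reference, the workaround described in the question could be sketched
roughly as follows. This is only an illustration: the module name empty_sup,
the helper add_child/1 and the restart intensity are assumptions, not
anything from the original post.

```erlang
-module(empty_sup).
-behaviour(supervisor).

-export([start_link/0, add_child/1]).
-export([init/1]).

%% Start the supervisor with no children, so start_link itself
%% cannot fail because of a child's init function.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Add a child afterwards. If the child's start function fails,
%% start_child returns an {error, ...} tuple to the caller
%% and the supervisor itself keeps running.
add_child(ChildSpec) ->
    supervisor:start_child(?MODULE, ChildSpec).

%% Empty child list; the restart intensity (1 restart in 60 s)
%% is an arbitrary value chosen for this sketch.
init([]) ->
    {ok, {{one_for_one, 1, 60}, []}}.
```

A caller would then run empty_sup:start_link() followed by one
empty_sup:add_child(Spec) per child, checking each return value.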

Basically, the whole idea with the OTP supervisor is to *not* propagate
the error. Snip from OTP Design Principles: "The basic idea of a
supervisor is that it should keep its child processes alive by
restarting them when necessary."

If the supervisor ends up in an unrecoverable situation, e.g. if a
child process fails to start or if the maximum restart frequency is
exceeded, the supervisor terminates its child processes and then itself
with reason shutdown.

So the short answer to your question is 'no'.

I would recommend that you implement your own supervisor or, if
the important thing for you is to *see* the actual error reason rather
than to propagate it, start the SASL application. It adds an event handler to
error_logger which prints out error information when behaviour processes
(gen_servers etc.) terminate.
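
Starting SASL from code could look something like the sketch below. The
module name sasl_boot and the function ensure_sasl/0 are made up for this
example; application:start/1 is the standard call.

```erlang
-module(sasl_boot).
-export([ensure_sasl/0]).

%% Start the SASL application so that crash reports and supervisor
%% reports are printed. Returns ok both when SASL was started by this
%% call and when it was already running.
ensure_sasl() ->
    case application:start(sasl) of
        ok -> ok;
        {error, {already_started, sasl}} -> ok;
        {error, Reason} -> {error, Reason}
    end.
```

From the shell, typing application:start(sasl). before starting the
supervisor has the same effect.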

8> catch sup:start_link(fail).
** exited: shutdown **

=CRASH REPORT==== 2-Feb-2006::13:11:25 ===
   crasher:
     pid: <0.71.0>
     registered_name: gens
     error_info: fail
     initial_call: {gen,init_it,
                       [gen_server,
                        <0.70.0>,
                        <0.70.0>,
                        {local,gens},
                        gens,
                        [gens,fail],
                        []]}
     ancestors: [armitage,<0.57.0>]
     messages: []
     links: [<0.70.0>]
     dictionary: []
     trap_exit: false
     status: running
     heap_size: 233
     stack_size: 21
     reductions: 97
   neighbours:

=SUPERVISOR REPORT==== 2-Feb-2006::13:11:25 ===
      Supervisor: {local,armitage}
      Context:    start_error
      Reason:     fail
      Offender:   [{pid,undefined},
                   {name,gens},
                   {mfa,{gens,start_link,[gens,fail]}},
                   {restart_type,permanent},
                   {shutdown,2000},
                   {child_type,worker}]

/ Gunilla



