[erlang-questions] supervisor process not responding to messages ('EXIT', which_children, etc)

Scott Lystig Fritchie fritchie@REDACTED
Thu Apr 29 21:11:51 CEST 2010


Garret Smith <garret.smith@REDACTED> wrote:

gs> Say enough workers under child_sup_2 die in a short time to exceed
gs> the restart limit.  child_sup_2 then exits as expected.
gs> app_supervisor then restarts child_sup_2 as expected.  child_sup_2
gs> takes too long to restart, so app_supervisor kills it during init,
gs> also terminating any workers that had started.

Er, but the last round of correspondence on this topic found that there
isn't a timeout during worker init, so how does child_sup_2 get killed
during its init?  Am I misunderstanding something?

gs> What I have observed is that app_supervisor is deadlocked in
gs> proc_lib:sync_wait/2.  It no longer responds to any messages: 'EXIT'
gs> signals from other children, which_children messages from
gs> supervisor:which_children, etc.  I am pretty sure that this is not
gs> intended behavior...

It's a case of "Doctor, it hurts when I do this".  The supervisor will
become responsive again when the child's init is finished.  The child's
init is taking too long, so make it shorter.
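
One way to do that, as a rough sketch only: have init/1 return right
away and do the slow work in the first message the process handles.
This assumes the child is a gen_server. The module name slow_worker,
the get_resource request, and slow_setup/0 are hypothetical stand-ins
for whatever your child really does.

```erlang
-module(slow_worker).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Return immediately so the supervisor's synchronous start call
%% unblocks.  The message to self() is queued before init/1 returns,
%% so the slow work runs right after startup completes.
init([]) ->
    self() ! finish_init,
    {ok, not_ready}.

handle_info(finish_init, not_ready) ->
    {noreply, {ready, slow_setup()}};
handle_info(_Msg, State) ->
    {noreply, State}.

%% Defensive clause: refuse requests if setup has not run yet.
handle_call(_Request, _From, not_ready) ->
    {reply, {error, not_ready}, not_ready};
handle_call(get_resource, _From, {ready, Resource} = State) ->
    {reply, {ok, Resource}, State};
handle_call(_Request, _From, State) ->
    {reply, {error, unknown_request}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

%% Hypothetical placeholder for the expensive initialization.
slow_setup() ->
    timer:sleep(2000),
    resource.
```

The trade-off is that the supervisor now considers the child started
before the setup has actually succeeded.  If slow_setup/0 crashes, it
shows up as an ordinary child exit that counts against the restart
intensity, not as a failed start.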

-Scott