[erlang-questions] release handler crash during relup, sys:get_status noproc

Richard Jones rj@REDACTED
Mon Aug 24 18:20:16 CEST 2015

Has anyone else experienced a crash like this when doing a release upgrade,
i.e. when calling release_handler:install_release with a valid relup?

{"init terminating in

I've seen this a couple of times now (Erlang 17.x) when upgrading
production systems under load, even with a trivial relup. I have no idea
which process that pid belonged to.

I think it might be a race in release_handler_1, which calls
sys:get_status without a catch. The process in question may have belonged
to a part of the supervision tree that had legitimately shut down after the
list of pids was fetched.



which calls get_proc_state, which does:

{status, _, {module, _}, [_, State, _, _, _]} = sys:get_status(Proc)
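To illustrate the failure mode I suspect, here is a minimal sketch that is not taken from release_handler_1. The module name status_race and the helper safe_status/1 are hypothetical. It calls sys:get_status on a pid that has already exited, once unguarded and once wrapped in try/catch. As far as I can tell the unguarded call exits with a noproc reason, which would take down the caller if nothing catches it.

```erlang
-module(status_race).
-export([demo/0, safe_status/1]).

%% Hypothetical helper: returns {ok, Status} or {error, Reason}
%% instead of letting the exit from sys:get_status propagate.
safe_status(Pid) ->
    try sys:get_status(Pid) of
        Status -> {ok, Status}
    catch
        exit:Reason -> {error, Reason}
    end.

demo() ->
    %% A process that exits immediately stands in for a child that
    %% shut down after the list of pids was fetched.
    {Pid, Ref} = spawn_monitor(fun() -> ok end),
    receive {'DOWN', Ref, process, Pid, _} -> ok end,
    %% Unguarded call, wrapped in catch here only so the demo survives.
    Unguarded = (catch sys:get_status(Pid)),
    %% Guarded call returns a tagged error tuple instead of exiting.
    Guarded = safe_status(Pid),
    {Unguarded, Guarded}.
```

Calling status_race:demo() should return a pair in which the first element is an 'EXIT' tuple and the second is {error, Reason}.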

I haven't managed to write a test for this yet. I'm planning to spam a busy
supervisor with lots of terminate_child calls while calling
release_handler_1:get_supervised_procs, to try to reproduce it.
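A rough sketch of that kind of stress test, under some assumptions: it does not call release_handler_1 at all, but uses supervisor:which_children followed by sys:get_status as a stand-in for "fetch the pid list, then query each pid". The module name status_race_repro and all function names in it are hypothetical, and the children are bare gen_event processes only because they answer system messages without needing a callback module.

```erlang
-module(status_race_repro).
-behaviour(supervisor).
-export([run/1, init/1]).

%% simple_one_for_one supervisor whose temporary children are
%% plain gen_event processes.
init([]) ->
    Child = {ev, {gen_event, start_link, []},
             temporary, 1000, worker, dynamic},
    {ok, {{simple_one_for_one, 1, 5}, [Child]}}.

%% Runs Rounds probe passes while another process churns children.
%% Returns how many sys:get_status calls exited.
run(Rounds) ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    Churner = spawn_link(fun() -> churn(Sup) end),
    Count = probe(Sup, Rounds, 0),
    unlink(Churner),
    exit(Churner, kill),
    unlink(Sup),
    exit(Sup, shutdown),
    Count.

%% Keep starting and terminating batches of children forever.
churn(Sup) ->
    Pids = [begin {ok, P} = supervisor:start_child(Sup, []), P end
            || _ <- lists:seq(1, 20)],
    [ok = supervisor:terminate_child(Sup, P) || P <- Pids],
    churn(Sup).

%% Snapshot the child pids, then query each one; some may have been
%% terminated between the snapshot and the query.
probe(_Sup, 0, Acc) ->
    Acc;
probe(Sup, N, Acc) ->
    Pids = [P || {_, P, _, _} <- supervisor:which_children(Sup), is_pid(P)],
    Failed = length([P || P <- Pids, not status_ok(P)]),
    probe(Sup, N - 1, Acc + Failed).

status_ok(Pid) ->
    try sys:get_status(Pid) of
        _ -> true
    catch
        exit:_ -> false
    end.
```

Something like status_race_repro:run(1000) returns the number of failed status calls. A non-zero count would only show that the window exists for this stand-in, not that release_handler_1 hits it.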

If I'm right, it would only be triggered when parts of a supervision tree are
shutting down during a release upgrade, which perhaps isn't very common,
depending on how dynamic the average supervision tree is in Erlang apps.

Any feedback appreciated before I spend more time studying release handler
code :)
