[erlang-questions] release handler crash during relup, sys:get_status noproc

Richard Jones rj@REDACTED
Mon Aug 24 23:29:08 CEST 2015


Hi,

It's not a distributed application.

The release handler is called from another script, as part of the
deployment process.

It runs "erl -name deployer@REDACTED -setcookie .. " and then does
rpc:call(TargetNode, release_handler, ....)


RJ


On 24 Aug 2015 21:55, "Éric Pailleau" <eric.pailleau@REDACTED> wrote:

> Hi,
>
> A pid whose first field is not 0 is not a local pid.
> Is your release upgrading a distributed application?
>
>
> On 24 Aug 2015 18:20, Richard Jones <rj@REDACTED> wrote:
> >
> > Has anyone else experienced a crash like this when doing a release
> > upgrade, i.e. calling release_handler:install_release with a valid relup?
> >
> > {"init terminating in do_boot",{{badmatch,{error,{'EXIT',{noproc,{sys,get_status,[<6453.14610.13>]}}}}},[{erl_eval,expr,3,[]}]}}
> >
> > I've seen this a couple of times now (Erlang 17.x) when upgrading
> > production systems under load, even with a trivial relup. No idea what
> > that pid was.
> >
> > I think it might be a race in release_handler_1, where it calls
> > sys:get_status without a catch: the process in question may have been
> > part of a supervision tree that had legitimately shut down since the
> > list of pids was fetched.
> >
> > ie:
> >
> >
> > https://github.com/erlang/otp/blob/OTP-17.5.6.3/lib/sasl/src/release_handler_1.erl#L589
> >
> > which calls get_proc_state, which does:
> >
> > {status, _, {module, _}, [_, State, _, _, _]} = sys:get_status(Proc)
> >
> > I've not managed to write a test for this yet. I'm planning to spam lots
> > of terminate_child calls at a busy supervisor while calling
> > release_handler_1:get_supervised_procs to try to reproduce it.
> >
> > If I'm right, it would only be triggered if parts of a supervision tree
> > are shutting down during a release upgrade, which perhaps isn't very
> > common, depending on how dynamic the average supervision tree is in
> > Erlang apps.
> >
> > Any feedback appreciated before I spend more time studying the release
> > handler code :)
> >
> > RJ
> >
> >
> >
> >
> >
>
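
For illustration, the kind of guard the quoted message hints at could look
roughly like the sketch below. It assumes the crash comes from
sys:get_status exiting with noproc because the target process terminated
after the list of pids was collected. The module name safe_status and the
{ok, State} | {error, noproc} return shape are hypothetical and not part of
OTP; this is not how release_handler_1 is written, and a real fix there may
look different.

```erlang
-module(safe_status).
-export([get_proc_state/1]).

%% Hypothetical wrapper (not OTP code): fetch the status of Proc, but
%% treat a process that has already exited as a normal outcome instead
%% of letting the exit propagate to the caller.
get_proc_state(Proc) ->
    try sys:get_status(Proc) of
        %% Same pattern as the match quoted from release_handler_1.
        {status, _, {module, _}, [_, State, _, _, _]} ->
            {ok, State}
    catch
        %% sys:get_status exits with {noproc, {sys, get_status, [Proc]}}
        %% when the process no longer exists, as in the crash report.
        exit:{noproc, _} ->
            {error, noproc}
    end.
```

A caller walking a list of supervised pids could then skip any entry that
returns {error, noproc} rather than crash with a badmatch.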