[erlang-patches] release_handler_1 improvements
Siri Hansen
erlangsiri@REDACTED
Mon Aug 29 16:31:16 CEST 2011
Hi Joe -
I think I would prefer solution 1), although that's probably mostly because
I don't really understand solution 2)... What do you mean by "stop the
upgrade from completing"? in which state would the node be after this?
/siri
2011/8/26 Joe Williams <joe@REDACTED>
> Siri,
>
> That sounds correct, with the current patch there is that risk. In my case
> I would see the error message post-upgrade and restart things as needed but
> I certainly see your point. The VM restarting is a brutal but idiomatic way
> to deal with this issue, let it fail :).
>
> I think there are two possibilities here, 1) continue with the restart
> behavior but make sure we print error messages before we do or 2) print
> error messages but stop the upgrade from completing if we catch the bad
> case. Thoughts?
>
> -Joe
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
> On Friday, August 26, 2011 at 1:08 AM, Siri Hansen wrote:
>
> Hi again, Joe!
>
> Sorry for being so slow - but I still don't really understand :(
> My concerns are really about whether or not we should allow the upgrade to
> be performed in this case. For sure I think we should
>
> 1) avoid the timeout, and
> 2) let the user know what the problem is
>
> but is it correct to let the upgrade pass after this? Is it not an error
> situation?
>
> It seems to me that we risk getting into a situation where we believe that
> the system is upgraded, but in fact there could be branches of the
> supervisor tree where process have not had the chance to run their
> code_change functions. I mean - even if we print the error report, there is
> no guarantee that it is really detected unless the operation actually fails.
>
> Please correct me if I completely misunderstood the situation.
>
> Regards
> /siri
>
>
> 2011/8/25 Joe Williams <joe@REDACTED>
>
> Siri,
>
> I ran into two issues that this patch addresses. Check out the commit
> message at
> https://github.com/joewilliams/otp/commit/9c3a53789326cdd929f1c3b4525716b1c0abfe87 for
> the details. In both cases I found that in production an error in the logs
> was preferable to the restart of the VM since both are easily fixable with a
> small application change or in the case of the suspended supervisor using a
> different app up. Also see this comment in release_handler_1 regarding the
> supervisor,
> https://github.com/erlang/otp/blob/dev/lib/sasl/src/release_handler_1.erl#L454 which
> suggests this corner case is known by at least a few people. Currently there
> is no way to know *why* your VM just restarted after the upgrade in either
> case.
>
> Let me know if you have any other questions.
>
> -Joe
>
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
> On Thursday, August 25, 2011 at 6:35 AM, Siri Hansen wrote:
>
> Hi again, Joe!
>
> Could you please explain a bit about the situation where you discovered
> this problem? I agree that the timeout and VM restart is not very good, and
> it makes sense to check if the supervisor is suspended. But I'm not really
> sure if it is correct to allow the upgrade to continue when this error
> occurs. Even if an error message is printed, I guess it could be quite easy
> to miss this fact... and the question is if that would be a problem or not?
> Why is the supervisor suspended in the first place?
>
> Regards
> /siri
>
>
> 2011/8/25 Siri Hansen <erlangsiri@REDACTED>
>
> Hi Joe - I've just started looking at this. Do you think it would be
> possible to add a test case for it?
> Regards
> /siri
>
>
> 2011/8/24 Joe Williams <joe@REDACTED>
>
> Anything I can do regarding this patch? I have happily been running it in
> production since I submitted it to the list in June.
>
> -Joe
>
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
> On Wednesday, July 6, 2011 at 3:43 PM, Joe Williams wrote:
>
> Anything I can do to help this patch graduate?
>
> Thanks!
>
> -Joe
>
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
> On Tuesday, June 14, 2011 at 12:26 PM, Joe Williams wrote:
>
> Updated this branch, please refetch.
>
> git fetch git://github.com/joewilliams/otp.git release_handler_1
>
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
> On Friday, June 10, 2011 at 8:52 AM, Joe Williams wrote:
>
> Great, thanks!
>
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
> On Friday, June 10, 2011 at 8:51 AM, Raimo Niskanen wrote:
>
> On Thu, Jun 09, 2011 at 08:20:51AM -0700, Joe Williams wrote:
>
> Please fetch:
>
> git fetch git://github.com/joewilliams/otp.git release_handler_1
>
> This is a different branch with a better commit message and no white space
> changes.
>
>
> Excellent. I will include your patch in 'pu' after rewording the
> summary line to imperative form.
>
>
>
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
>
> On Thursday, June 9, 2011 at 7:44 AM, Joe Williams wrote:
>
> Nothing specific, just wondered if anyone had any thoughts on how I dealt
> with a couple of corner cases in installing releases.
>
> I'll fix things up and get back shortly.
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED (mailto:joe@REDACTED <joe@REDACTED>)
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
>
> On Thursday, June 9, 2011 at 12:11 AM, Raimo Niskanen wrote:
>
> On Wed, Jun 08, 2011 at 03:41:37PM -0700, Joe Williams wrote:
>
> Any thoughts/feedback on this patch? I realize it doesn't follow the
> guidelines (https://github.com/erlang/otp/wiki/Submitting-patches) exactly
> and will clean it up soon.
>
>
> Anything in particular? I just got caught up in tideous merge work
> yesterday and missed to include your patch in 'pu', I was about
> to take it now.
>
> But if you have a cleanup I can wait for it...
>
>
>
> --
> Name: Joseph A. Williams
> Email: joe@REDACTED (mailto:joe@REDACTED <joe@REDACTED>)
> Blog: http://www.joeandmotorboat.com/
> Twitter: http://twitter.com/williamsjoe
>
>
> On Tuesday, June 7, 2011 at 2:33 PM, Joe Williams wrote:
>
> git fetch git://github.com/joewilliams/otp.git (
> http://github.com/joewilliams/otp.git) (
> http://github.com/joewilliams/otp.git) release_handler
>
>
> _______________________________________________
> erlang-patches mailing list
> erlang-patches@REDACTED (mailto:erlang-patches@REDACTED<erlang-patches@REDACTED>
> )
> http://erlang.org/mailman/listinfo/erlang-patches
>
>
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>
>
> _______________________________________________
> erlang-patches mailing list
> erlang-patches@REDACTED (mailto:erlang-patches@REDACTED<erlang-patches@REDACTED>
> )
> http://erlang.org/mailman/listinfo/erlang-patches
>
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>
>
>
>
>
>
> _______________________________________________
> erlang-patches mailing list
> erlang-patches@REDACTED
> http://erlang.org/mailman/listinfo/erlang-patches
>
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-patches/attachments/20110829/2813703a/attachment.htm>
More information about the erlang-patches
mailing list