[erlang-patches] release_handler_1 improvements

Joe Williams joe@REDACTED
Mon Aug 29 18:35:51 CEST 2011


Siri,

In case #2 the node would be left in an "unpacked" state, but perhaps that isn't possible since the upgrade may already be partially installed. I'll work on implementing #1 and reply back soon.

-Joe


-- 
Name: Joseph A. Williams
Email: joe@REDACTED
Blog: http://www.joeandmotorboat.com/
Twitter: http://twitter.com/williamsjoe


On Monday, August 29, 2011 at 7:31 AM, Siri Hansen wrote:

> Hi Joe - 
> I think I would prefer solution 1), although that's probably mostly because I don't really understand solution 2)... What do you mean by "stop the upgrade from completing"? In which state would the node be after this?
> /siri
> 
> 2011/8/26 Joe Williams <joe@REDACTED>
> >  Siri, 
> > 
> > That sounds correct; with the current patch there is that risk. In my case I would see the error message post-upgrade and restart things as needed, but I certainly see your point. Restarting the VM is a brutal but idiomatic way to deal with this issue: let it fail :).
> > 
> > I think there are two possibilities here: 1) continue with the restart behavior but make sure we print error messages before we do, or 2) print error messages but stop the upgrade from completing if we catch the bad case. Thoughts?
> > 
> > -Joe
> > 
> > 
> > On Friday, August 26, 2011 at 1:08 AM, Siri Hansen wrote:
> > 
> > > Hi again, Joe!
> > > 
> > > Sorry for being so slow - but I still don't really understand :(
> > > My concerns are really about whether or not we should allow the upgrade to be performed in this case. For sure I think we should 
> > > 
> > > 1) avoid the timeout, and
> > > 2) let the user know what the problem is
> > > 
> > > but is it correct to let the upgrade pass after this? Is it not an error situation?
> > > 
> > > It seems to me that we risk getting into a situation where we believe that the system is upgraded, but in fact there could be branches of the supervisor tree where processes have not had the chance to run their code_change functions. I mean - even if we print the error report, there is no guarantee that it is really detected unless the operation actually fails.
> > > 
> > > Please correct me if I completely misunderstood the situation.
> > > 
> > > Regards
> > > /siri
> > > 
> > > 
> > > 2011/8/25 Joe Williams <joe@REDACTED>
> > > >  Siri, 
> > > > 
> > > > I ran into two issues that this patch addresses. See the commit message at https://github.com/joewilliams/otp/commit/9c3a53789326cdd929f1c3b4525716b1c0abfe87 for the details. In both cases I found that, in production, an error in the logs was preferable to a restart of the VM, since both are easily fixable with a small application change or, in the case of the suspended supervisor, by using a different appup. Also see this comment in release_handler_1 regarding the supervisor, https://github.com/erlang/otp/blob/dev/lib/sasl/src/release_handler_1.erl#L454, which suggests this corner case is known by at least a few people. Currently there is no way to know *why* your VM just restarted after the upgrade in either case.
> > > > 
> > > > Let me know if you have any other questions.
> > > > 
> > > > -Joe
> > > > 
> > > > 
> > > > 
> > > > On Thursday, August 25, 2011 at 6:35 AM, Siri Hansen wrote:
> > > > 
> > > > > Hi again, Joe!
> > > > > 
> > > > > Could you please explain a bit about the situation where you discovered this problem? I agree that the timeout and VM restart is not very good, and it makes sense to check if the supervisor is suspended. But I'm not really sure if it is correct to allow the upgrade to continue when this error occurs. Even if an error message is printed, I guess it could be quite easy to miss this fact... and the question is if that would be a problem or not? Why is the supervisor suspended in the first place? 
> > > > > 
> > > > > Regards
> > > > > /siri
> > > > > 
> > > > > 
> > > > > 2011/8/25 Siri Hansen <erlangsiri@REDACTED>
> > > > > >  Hi Joe - I've just started looking at this. Do you think it would be possible to add a test case for it?
> > > > > > Regards
> > > > > > /siri
> > > > > > 
> > > > > > 
> > > > > > > 2011/8/24 Joe Williams <joe@REDACTED>
> > > > > > >  Anything I can do regarding this patch? I have happily been running it in production since I submitted it to the list in June. 
> > > > > > > 
> > > > > > > -Joe
> > > > > > > 
> > > > > > > 
> > > > > > > -- 
> > > > > > > Name: Joseph A. Williams
> > > > > > > Email: joe@REDACTED (mailto:joe@REDACTED)
> > > > > > >  Blog: http://www.joeandmotorboat.com/
> > > > > > > Twitter: http://twitter.com/williamsjoe
> > > > > > > 
> > > > > > > 
> > > > > > > On Wednesday, July 6, 2011 at 3:43 PM, Joe Williams wrote:
> > > > > > > 
> > > > > > > > Anything I can do to help this patch graduate?
> > > > > > > > 
> > > > > > > > Thanks!
> > > > > > > > 
> > > > > > > > -Joe
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On Tuesday, June 14, 2011 at 12:26 PM, Joe Williams wrote:
> > > > > > > > 
> > > > > > > > > Updated this branch, please refetch.
> > > > > > > > > 
> > > > > > > > > > git fetch git://github.com/joewilliams/otp.git release_handler_1
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Friday, June 10, 2011 at 8:52 AM, Joe Williams wrote:
> > > > > > > > > 
> > > > > > > > > > Great, thanks!
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > On Friday, June 10, 2011 at 8:51 AM, Raimo Niskanen wrote:
> > > > > > > > > > 
> > > > > > > > > > > On Thu, Jun 09, 2011 at 08:20:51AM -0700, Joe Williams wrote:
> > > > > > > > > > > > Please fetch:
> > > > > > > > > > > > 
> > > > > > > > > > > > > git fetch git://github.com/joewilliams/otp.git release_handler_1
> > > > > > > > > > > > 
> > > > > > > > > > > > This is a different branch with a better commit message and no white space changes.
> > > > > > > > > > > 
> > > > > > > > > > > Excellent. I will include your patch in 'pu' after rewording the
> > > > > > > > > > > summary line to imperative form.
> > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > On Thursday, June 9, 2011 at 7:44 AM, Joe Williams wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > > Nothing specific, just wondered if anyone had any thoughts on how I dealt with a couple of corner cases in installing releases.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I'll fix things up and get back shortly.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Thursday, June 9, 2011 at 12:11 AM, Raimo Niskanen wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > On Wed, Jun 08, 2011 at 03:41:37PM -0700, Joe Williams wrote:
> > > > > > > > > > > > > > > Any thoughts/feedback on this patch? I realize it doesn't follow the guidelines (https://github.com/erlang/otp/wiki/Submitting-patches) exactly and will clean it up soon.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Anything in particular? I just got caught up in tedious merge work
> > > > > > > > > > > > > > > yesterday and forgot to include your patch in 'pu'; I was about
> > > > > > > > > > > > > > > to take it now.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > But if you have a cleanup I can wait for it...
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > On Tuesday, June 7, 2011 at 2:33 PM, Joe Williams wrote:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > git fetch git://github.com/joewilliams/otp.git release_handler
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > _______________________________________________
> > > > > > > > > > > > > > > erlang-patches mailing list
> > > > > > > > > > > > > > > > erlang-patches@REDACTED
> > > > > > > > > > > > > > > http://erlang.org/mailman/listinfo/erlang-patches
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > -- 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > -- 
> > > > > > > > > > > 
> > > > > > > > > > > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
