[erlang-questions] msg mailbox and gen_svr shutdown

Loïc Hoguin essen@REDACTED
Tue May 15 18:40:06 CEST 2012


Okay, thanks for the details, I have a clearer view about it now. I 
think I can include something that could help you.

Today we have:

* start_listener
* stop_listener

We could also add something like:

* suspend_listener
* resume_listener

Suspending would stop part of the listener's supervision tree in the 
normal way, namely cowboy_acceptors_sup, and resuming would start it 
again.

Stopping that part kills the listening socket and the acceptors. 
Considering your gist you know that already. However, you are also 
killing cowboy_listener, which you really shouldn't, considering that 
protocols can call it (but that's more a consideration for me; it 
probably doesn't matter in your case).

This all fits properly with the patch I'm working on for Ranch (the 
split-off acceptor code that Cowboy will use past 0.6). On restarting 
this supervisor, a new listening socket gets opened, the acceptors query 
the listener gen_server for up-to-date protocol options and everything 
runs as before (with my patch, anyway).
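
To illustrate the suspend/resume idea with plain OTP calls, here is a 
minimal sketch. It is not Cowboy's or Ranch's code: the module name 
suspend_sketch, the child ids and the placeholder workers are made up 
for this example, with the 'acceptors' child standing in for 
cowboy_acceptors_sup and the 'listener' child standing in for 
cowboy_listener. Only supervisor:terminate_child/2, 
supervisor:restart_child/2 and supervisor:which_children/1 are real OTP 
functions. Map-based child specs assume OTP 18 or later.

```erlang
-module(suspend_sketch).
-behaviour(supervisor).

-export([start_link/0, suspend/0, resume/0, demo/0]).
-export([init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Hypothetical children: 'listener' stands in for cowboy_listener,
    %% 'acceptors' stands in for cowboy_acceptors_sup.
    Children = [
        #{id => listener, start => {?MODULE, start_worker, []}},
        #{id => acceptors, start => {?MODULE, start_worker, []}}
    ],
    {ok, {#{strategy => one_for_one}, Children}}.

%% Placeholder child: a linked process that just waits.
start_worker() ->
    {ok, spawn_link(fun() -> receive stop -> ok end end)}.

%% Stop only the 'acceptors' child; its child spec is kept.
suspend() ->
    supervisor:terminate_child(?MODULE, acceptors).

%% Start the 'acceptors' child again from the kept child spec.
resume() ->
    supervisor:restart_child(?MODULE, acceptors).

%% Look up the current pid (or 'undefined') of a child by id.
child_pid(Id) ->
    Children = supervisor:which_children(?MODULE),
    {Id, Pid, _Type, _Mods} = lists:keyfind(Id, 1, Children),
    Pid.

demo() ->
    {ok, Sup} = start_link(),
    Listener = child_pid(listener),
    ok = suspend(),
    undefined = child_pid(acceptors),
    Listener = child_pid(listener),      % untouched by suspend
    {ok, NewAcceptors} = resume(),
    true = is_pid(NewAcceptors),
    Listener = child_pid(listener),      % untouched by resume
    %% Clean up: stop the supervisor without taking the caller down.
    Ref = monitor(process, Sup),
    unlink(Sup),
    exit(Sup, shutdown),
    receive {'DOWN', Ref, process, Sup, _} -> ok end.
```

Calling suspend_sketch:demo() checks that the 'listener' child keeps 
the same pid across a suspend and a resume, which is the property that 
matters here.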

Done like this, I can certainly include it. Being able to suspend and 
resume listening without killing currently running connections could 
definitely be useful for many things (the key point being the ability to 
resume), especially when manipulating a release in production while 
you're behind a load balancer.

Gracefully stopping connection processes, whether custom protocols or 
HTTP handlers, is the application's responsibility, however.
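
As an illustration of what the application side can look like for a 
gen_server, here is a minimal sketch of the trap_exit approach mentioned 
further down in this thread. The module drain_sketch, the 'work' message 
and the {drained, N} report are invented for the example. It relies on 
standard gen_server behaviour: with trap_exit set, the shutdown signal 
from the parent is queued behind the messages already in the mailbox, so 
those are handled before terminate/2 runs. Messages sent after the 
shutdown are still not processed.

```erlang
-module(drain_sketch).
-behaviour(gen_server).

-export([start_link/1, demo/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Reporter is the pid that gets told how many messages were handled.
start_link(Reporter) ->
    gen_server:start_link(?MODULE, Reporter, []).

init(Reporter) ->
    %% Turn the parent's exit signal into an ordinary mailbox message.
    process_flag(trap_exit, true),
    {ok, {Reporter, 0}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

%% Count every 'work' message that is actually handled.
handle_cast(work, {Reporter, Count}) ->
    {noreply, {Reporter, Count + 1}}.

handle_info(_Info, State) ->
    {noreply, State}.

%% Runs once the shutdown request from the parent has been dequeued.
terminate(_Reason, {Reporter, Count}) ->
    Reporter ! {drained, Count},
    ok.

demo() ->
    {ok, Pid} = start_link(self()),
    %% Queue three messages, then ask for shutdown like a supervisor would.
    lists:foreach(fun(_) -> gen_server:cast(Pid, work) end, lists:seq(1, 3)),
    unlink(Pid),            % keep the caller alive when Pid exits
    exit(Pid, shutdown),
    receive
        {drained, Count} -> Count
    after 1000 ->
        timeout
    end.
```

In this sketch the calling process plays the role of the supervisor, so 
the three casts and the shutdown signal come from the same sender and 
arrive in order.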

So I'm interested in adding this eventually, but only after the Ranch 
switch and the changes currently cooking, because otherwise you could 
resume with an older version of the protocol options.

Thanks for the suggestion!

On 05/15/2012 05:20 PM, Bob Ippolito wrote:
> This isn't acceptable for me, because graceful shutdown can take
> minutes. Not listening for connections is ok for me because the clients
> connect to multiple servers and choose whichever seems fastest. This
> method of handoff reduces the downtime for a given machine to msec.
> Having a machine down for minutes would still work, but it's not
> desirable to have rolling deploys take hours instead of minutes. Any
> solution that requires hot code loading is also not going to work when
> you upgrade the VM.
>
> The only way to do better is to be able to pass sockets from one OS
> process to another (new NIFs), or open a hole to allow the app to modify
> pf settings. For me, it's sufficient that if a socket is accepted then
> it'll be handled appropriately and in a timely manner, so I prefer the
> current (cross-platform and pure Erlang) solution.
>
> On Tuesday, May 15, 2012, Loïc Hoguin wrote:
>
>     I personally wouldn't want this to become common, because even if
>     you restart immediately, you would lose a few incoming connections
>     (or a lot depending on the popularity of your service). If I
>     understand correctly you go as far as restarting the whole node?
>
>     This isn't like nginx, since nginx doesn't stop listening at any
>     point. You could actually get something closer to nginx by doing this:
>
>     * set acceptors to 0 (incoming connections get queued)
>     * gracefully shutdown the running connections (application dependent)
>     * load the new code
>     * optionally restart parts of your supervision tree
>     * set acceptors back to N
>
>     You can add an optional set_protocol_options somewhere in there if
>     you need to upgrade the dispatch list (maybe after loading the new
>     code you could always run
>     set_protocol_options(my_project:get_dispatch_stuff) or something,
>     which would always get the last options for your project).
>
>     The number of incoming connections queued is also application
>     dependent (and will be upgradable later on so you can increase
>     temporarily).
>
>     This would allow you to do what you want without any extra feature
>     from what is currently planned, and without losing any incoming
>     connections.
>
>     All this is easily scriptable of course.
>
>     If that still doesn't fit your needs I'd suggest opening a ticket to
>     see what others think about it.
>
>     On 05/15/2012 07:35 AM, Bob Ippolito wrote:
>
>         The idea is to start a new version of the service and do a quick
>         hand-off of listening from old to new (possibly even switching
>         back). This is particularly useful for code upgrades that aren't
>         suitable for hot code loading. We're doing continuous integration
>         this way, as hot code loading requires too much work and/or
>         planning for us at this stage. It's currently an uncommon need but
>         it would be a lot more common if it was well documented and easier
>         to do.
>
>         As far as load balancers go, we're not using them. For our use
>         case we don't need them, so why add the complexity and/or cost?
>         There is port forwarding done by the OS kernel but I want to avoid
>         requiring some setuid binary to do the privilege escalation that
>         would be necessary to change ports on the fly.
>
>         Nginx does upgrades in a similar way.
>         http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly
>
>         On Monday, May 14, 2012, Loïc Hoguin wrote:
>
>             As with all proposed features I first need to understand what
>             it's used for and why, and whether it'll be useful to other
>             people. What you are doing just sounds very weird to me and
>             it's the first time I hear of a need for something like this,
>             so at this point I'm neither trying to avoid anything nor
>             considering it useful functionality, just wanting to
>             understand why.
>
>             On 05/15/2012 06:42 AM, Bob Ippolito wrote:
>
>                 Yes, I want the other process to immediately take over.
>                 Requiring a load balancer change isn't an elegant
>                 solution, I don't understand why you're trying to avoid
>                 adding this useful functionality.
>
>                 On Monday, May 14, 2012, Loïc Hoguin wrote:
>
>                     You stop listening after you're done gracefully
>                     stopping your currently running processes.
>
>                     For HTTP that could be something like:
>
>                     * Set acceptors to 0.
>                     * Optionally set 'onrequest' hook to reply with a 503
>                       (for keepalives).
>                     * Gracefully stop your processes.
>                     * Stop the listener.
>
>                     Or do you need another process to take over
>                     immediately? Because in that case you usually don't
>                     need to use the same listening port, you can just
>                     change your firewall/lb rules for the port redirection
>                     from 80->P1 to 80->P2.
>
>                     On 05/15/2012 03:34 AM, Bob Ippolito wrote:
>
>                         I don't think that is sufficient, you will need to
>                         stop listening as well so the next OS process can
>                         take over.
>
>                         On Monday, May 14, 2012, Loïc Hoguin wrote:
>
>                             This will be possible later on by reducing the
>                             number of acceptors to 0. This should be added
>                             sometime this summer after the acceptor split
>                             happens in Cowboy.
>
>                             On 05/14/2012 07:26 PM, Bob Ippolito wrote:
>
>                                 I agree that graceful shutdown is very
>                                 application-specific. However, cowboy doesn't
>                                 currently facilitate any sort of graceful
>                                 shutdown unless you read the source code and
>                                 poke directly at the appropriate supervisors
>                                 like I did. The application specific stuff is
>                                 easily done on your own with a process
>                                 registry or a timeout, we used a combination
>                                 of gproc and a timeout if things didn't shut
>                                 down in an acceptable time frame.
>
>                                 I would suggest something like
>                                 cowboy:stop_listener/1, maybe something like
>                                 cowboy:stop_listening/1 or
>                                 cowboy:stop_accepting/1.
>
>                                 On Sun, May 13, 2012 at 11:02 PM, Loïc Hoguin
>                                 <essen@REDACTED> wrote:
>
>                                     Right, this is pretty much what I said to
>                                     Paul in PM. It was in R14B03 that the
>                                     behavior changed. I apparently have a
>                                     @todo wrong past that release, and will
>                                     take a look if I find other things to
>
>                                     https://gist.github.com/2655724
>
>                                                 On Thu, May 10, 2012 at 2:53 AM, Paweł Peregud
>                                                 <paulperegud@REDACTED> wrote:
>
>                                                 I was having fun with supervisors yesterday
>                                                 (Cowboy seems to fail to fulfill the promise
>                                                 of not killing request processes after
>                                                 listener removal) and I have an example. I've
>                                                 only investigated the case when the
>                                                 supervisor is killed, so YMMV. Example code
>                                                 is attached. You may modify it to check the
>                                                 behavior in your case.
>
>                                                 Start the supervisor tree with
>                                                 exp_sup_sup:start_link(). Execute the tests
>                                                 with exp_sup_sup:test() and
>                                                 exp_sup_sup:test_simple().
>
>                                                 In the case of a dying supervisor the answer
>                                                 is "no, it does not".
>
>                                                 When the supervisor dies, your process is
>                                                 killed via the link mechanism, so it may
>                                                 leave some unprocessed messages in its inbox.
>                                                 To make sure that every delivered message is
>                                                 served, you need to add
>                                                 process_flag(trap_exit, true). Messages that
>                                                 are sent after the moment when the supervisor
>                                                 dies are not processed.
>
>                                                 Best regards,
>
>                                                 Paul.
>
>
>                                                 On May 9, 2012 11:06 AM, "Andy Richards"
>                                                 <andy.richards.iit@REDACTED> wrote:
>
>                                                 Hi,
>
>                                                 I can't seem to see any confirmation in the
>                                                 documentation so was wondering if anyone
>                                                 could confirm if messages are still sent to a
>                                                 supervised gen_svr following a shutdown
>                                                 message?
>
>                                                 If so how do I cleanly shut down my gen_svr
>                                                 without losing messages? I read in the
>                                                 supervisor child spec that a shutdown can be
>                                                 set to infinity which I hoped would allow me
>                                                 to process the msgs in my mailbox but if I do
>                                                 this will my module continue to receive
>                                                 messages from other processes? Is my approach
>                                                 flawed and if so what other ways are there to
>                                                 cleanly shut down my gen_svr without losing
>                                                 messages?
>
>                                                 Many thanks,
>
>                                                 Andy.
>
>
>           _________________________________________________________
>                                                 erlang-questions mailing list
>                                                 erlang-questions@REDACTED
>                                                 http://erlang.org/mailman/listinfo/erlang-questions


-- 
Loïc Hoguin
Erlang Cowboy
Nine Nines


