[erlang-patches] Supervisor shutdown reason when reaching max restarts

Siri Hansen erlangsiri@REDACTED
Mon May 12 14:17:13 CEST 2014


Hi Tobias!

It seems this feature would be "good to have" but is not totally necessary.
Since there has been very little feeback on this thread, and given the
backwards incompatibility, we have thus decided to reject it.

Thanks for you effort!
Best Regards
/siri


2014-04-16 15:49 GMT+02:00 Tobias Schlager <Tobias.Schlager@REDACTED>:

>  Hi Siri,
>
> I already solved the problem for my use case. In my case the supervisors
> of type C never exit gracefully except for the case the application gets
> stopped, so I ended up 'disarming' the monitors of process B in an
> 'application:prep_stop/1' callback.
>
> Of course, this is a debatable patch because it introduces backwards
> incompatible behavior and I will not argue if it gets declined. I just
> thought that having differentiated exit reason of supervisors would be a
> good thing to have in general (and that according to the documentation
> {shutdown, Reason} exits must be supported by callback implementations
> anyways).
>
> Regards
> Tobias
>
>  --
>
> Lindenbaum GmbH
> Dipl.-Inform. Tobias Schlager, Software Engineer
> Erbprinzenstr. 4-12, 76133 Karlsruhe, Deutschland
> Tel: +49 721 48 08 48 - 000, Fax: +49 721 48 08 48 - 801,
> E-Mail: tobias.schlager@REDACTED
>
> Firmensitz: Erbprinzenstr. 4-12, Eingang A, 76133 Karlsruhe
> Registergericht: Amtsgericht Mannheim, HRB 706184
> Geschäftsführer: Dr. Ralf Nikolai
> Steuernummer: 35007/02060, USt. ID: DE 263797265
>
> http://www.lindenbaum.eu
>
>
>   ------------------------------
> *Von:* Siri Hansen [erlangsiri@REDACTED]
> *Gesendet:* Mittwoch, 16. April 2014 14:54
>
> *An:* Tobias Schlager
> *Cc:* erlang-patches@REDACTED
> *Betreff:* Re: [erlang-patches] Supervisor shutdown reason when reaching
> max restarts
>
>   Hello Tobias and the list!
>
>  This review has finally made its way up the backlog. Thanks for your
> patience!
>
>  Is this use case still relevant?
>
>  I've started looking at the compatibility issue. As far as I can see
> there shouldn't be a big problem upwards in a normal supervision tree,
> since this is internal to OTP. Downwards, however, there might be problems
> since exit trapping children will get {shutdown,Reason} instead of shutdown
> when a supervisor terminates due to a failed restart attempt. In the case
> of a gen_server or gen_fsm, this exit reason will be propagated to the
> callback module via Mod:terminate/2. The value 'shutdown', used when the
> supervisor orders termination, is documented for this callback function.
> For other proc_lib processes the receive statement is implemented
> individually, so even here the change will be exposed to the user (if
> matching the exit reason).
>
>  Links and monitors across the supervision tree might also be a problem,
> of course. And the case when something other than a supervisor or the
> application_master is the parent of a supervisor - I don't know if this is
> very common, though, and I find the incompatibility downwards in the
> supervision tree most alarming.
>
>  As you know, we need to be very careful with backwards incompatible
> changes in core functionality, so for the time being we would like to
> investigate the general need for this functionality and if there are other
> possible ways to go.
>
>  Some questions for the list:
>
>  1. Is the use case described in the Tobias' previous mail a common one?
> If yes, how have you solved it?
>
>  2. Do you normally care about the exit reason in your Mod:terminate/2
> function or in your proc_lib receive statement when getting an EXIT from
> the parent?
>
>  3. Any other comments on this?
>
>  Thanks
> /siri
>
>
>
> 2013-08-23 14:17 GMT+02:00 Siri Hansen <erlangsiri@REDACTED>:
>
>> Tobias, thanks for a good explanation. I will post this into the ticket
>> and we'll take it into account when finishing the review.
>> Thanks again for your contribution!
>>  /siri
>>
>>
>> 2013/8/23 Tobias Schlager <Tobias.Schlager@REDACTED>
>>
>>>  Hi Siri,
>>>
>>> glad to hear from you, I'll try to do my best to explain the use case I
>>> have in mind.
>>>
>>> Consider you have a one_for_one (or simple_one_for_one) supervisor A
>>> with a worker child B that dynamically adds children to A (using
>>> supervisor:start_child/2). Now consider these children also are supervisors
>>> of type C with various statically configured workers. I now would like to
>>> monitor supervisors of type C from worker B to be able to take some action
>>> when *something goes wrong* at one of the C supervisors (e.g. C crashed
>>> because of one of its subworkers). However, I can't differentiate between
>>> 'something went wrong' or a supervisor C just exited gracefully (e.g. the
>>> application was stopped) because supervisors only exit with reason normal
>>> or shutdown. It is arguable whether to use another restart type for C
>>> supervisors in order to propagate the exit. However, I don't want to crash
>>> the whole supervision just to be able to tell that something failed to
>>> restart somewhere down the supervision path.
>>>
>>> In general, the new exit reasons are visible to all processes linked
>>> with supervisors or monitoring them (so parent supervisors as well as the
>>> application master will see these reasons). This is why I chose the
>>> '{shutdown, Reason}' format, which must be supported (according to the
>>> documentation this is considered a normal exit reason). Thus, changing the
>>> exit reasons will not affect the behaviour of supervision hierarchies
>>> (verified by the test suite) or the application master (as far as I can
>>> tell). The backward incompatibilty is located in processes depending on the
>>> undocumented behaviour of supervisors always exiting with normal or
>>> shutdown and not with '{shutdown, Reason}'.
>>>
>>> I hope, that my explanation makes things a bit clearer (and not worse).
>>>
>>> Regards
>>> Tobias
>>>
>>>   ------------------------------
>>> *Von:* Siri Hansen [erlangsiri@REDACTED]
>>> *Gesendet:* Freitag, 23. August 2013 09:50
>>> *An:* Tobias Schlager
>>> *Cc:* erlang-patches@REDACTED
>>> *Betreff:* Re: [erlang-patches] Supervisor shutdown reason when
>>> reaching max restarts
>>>
>>>    Hi Tobias!
>>> Thank you for the patch. We have discussed this on OTP Technical Board,
>>> and have come to the conclusion that some more investigation is needed of
>>> the potential backwards incompatibility. I have written a ticket and the
>>> job will be prioritized into our backlog. Unfortunately we won't make it
>>> before the next release (R16B02).
>>>
>>>  In order to help us a bit on the way, could you please provide some
>>> more information about your use case? You say that you are monitoring the
>>> supervisor from another process, do you mean other process than the
>>> supervisor's supervisor? If so, could you explain this architecture a bit
>>> more?
>>>
>>>  Who else will see this exit reason? - application_master? - the parent
>>> supervisors? - other?
>>>
>>>  Thanks again!
>>> Regards
>>> /siri
>>>
>>>
>>>
>>> 2013/7/4 Tobias Schlager <Tobias.Schlager@REDACTED>
>>>
>>>> Hi,
>>>>
>>>> this patch changes the behaviour of supervisors to exit with a more
>>>> specific reason when exiting due to a maximum restart limit hit. This is
>>>> especially useful (or even necessary) to distinguish between normal and
>>>> erroneous process terminations when monitoring a supervisor from another
>>>> process.
>>>>
>>>> In the above case a supervisor would now exit with {shutdown,
>>>> {reached_max_restart_intensity, Child}} where Child is whatever is
>>>> available to describe the child, either a child id or in case of a
>>>> simple_one_for_one supervisor the offending child's process id. The patch
>>>> should not affect the OTP restart behaviour (also for cascaded supervisors)
>>>> since a subclass of 'normal' exit reasons is used.
>>>>
>>>> I'm aware that there is some potential backward incompatibility for
>>>> people that do not expect {shutdown, Reason} when monitoring a supervisor.
>>>> However, the feature of exiting normally with {shutdown, Reason} has been
>>>> around for quite a while now and I think this could be a sensible place to
>>>> use it. Let me know what you think.
>>>>
>>>> The patch does include tests and updated documentation.
>>>>
>>>>           git fetch https://github.com/schlagert/otp.gitsupervisor_shutdown_reason
>>>>
>>>>
>>>> https://github.com/schlagert/otp/compare/erlang:master...supervisor_shutdown_reason
>>>>
>>>> https://github.com/schlagert/otp/compare/erlang:master...supervisor_shutdown_reason.patch
>>>>
>>>> Regards
>>>> Tobias
>>>> _______________________________________________
>>>> erlang-patches mailing list
>>>> erlang-patches@REDACTED
>>>> http://erlang.org/mailman/listinfo/erlang-patches
>>>>
>>>
>>>
>>
>
> _______________________________________________
> erlang-patches mailing list
> erlang-patches@REDACTED
> http://erlang.org/mailman/listinfo/erlang-patches
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-patches/attachments/20140512/81a995f7/attachment.htm>


More information about the erlang-patches mailing list