[erlang-patches] fix supervisors restarting temporary children

Frédéric Trottier-Hébert <>
Tue Aug 23 20:38:03 CEST 2011


On 2011-08-23, at 03:48 AM, Henrik Nord wrote:

> On 08/03/2011 05:07 PM, Frédéric Trottier-Hébert wrote:
>> git fetch :ferd/otp.git fix_supervisor_temporary_restart
>> 
>> This patch fixes the issue discussed at http://erlang.org/pipermail/erlang-questions/2011-August/060419.html, namely that in one_for_all or rest_for_one strategies, the supervisor is still trying to restart temporary children. Because we don't track their MFAs entirely, this causes repetitive crashes.
>> 
>> This patch fixes the behaviour by inserting a clause in terminate_children/2-3 (private function) that will omit temporary children when building a list of killed processes, to avoid having the supervisor trying to restart them again.
>> 
>> Only supervisors in need of restarting children used the list, so the change should be of no impact for the functions that called terminate_children/2-3 only to kill all children. This also avoids useless iterations if the filtering step were to be inserted at any other point in the code.
>> 
>> A test suite has been added to supervisor_SUITE, containing checks for both one_for_all and rest_for_one. I've only ran the tests for the suite itself, but all of them pass without a problem.
>> 
>> The commit/diff itself: https://github.com/ferd/otp/commit/25aab8efdef5c6898a679705158627cded3f847a
>> 
>> --
>> Fred Hébert
>> http://www.erlang-solutions.com
>> 
>> 
>> 
>> _______________________________________________
>> erlang-patches mailing list
>> 
>> http://erlang.org/mailman/listinfo/erlang-patches
> 
> Hello
> 
> Your patch has been reviewed, and we want you to fix the following for it to be graduated.
> 
> 1) I think the test needs to be extended a bit. I can't see that it checks what happens to the temporary child - i.e. that it is terminated but not restarted. As far as I can understand it only checks that the supervisor survives...
> 
> 2) There is a minor typo in the description of the test - it says one_for_rest instead of rest_for_one.
> 
> 3) The behaviour must be documented - i.e. it must be explicitly stated that temporary children under one_for_all or rest_for_one supervisors will be terminated but not restarted.
> 
> /siri
> 
> 
> Thank you.
> 
> -- 
> /Henrik Nord Erlang/OTP
> 

1) Good catch. The broken behaviour would crash the supervisor -- and I tested that this was solved, but not that the child isn't retarted. This has been fixed.

2) also fixed

3) I changed it. I wasn't too sure where to put it, but there was only one place where temporary children were discussed so I added it there.

I've amended the commit: https://github.com/ferd/otp/commit/4f9b938112a80f1c1e210b1a6af40a42f833cfa2

The fetch command was: git fetch :ferd/otp.git fix_supervisor_temporary_restart

Thanks for your time,

--
Fred Hébert
http://www.erlang-solutions.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-patches/attachments/20110823/76b977c8/attachment.html>


More information about the erlang-patches mailing list