EEP 56 Update

Wed Mar 10 03:17:28 CET 2021

On 3/9/21 2:48 AM, Maria Scott wrote:
> A specific point on which I would like more feedback is how a combination of "auto_shutdown => never" and child specs with "significant => true" (be it in the return from init/1 or given to start_child/2) settings.
> In actual meaning, the two options don't contradict each other, ie you may mark a child as significant, the supervisor just doesn't care in respect to auto-shutdown.
> It may be that this is rarely what is wanted in practice, but it also has no indisputable negative effects.
> So we think it should be accepted (ie, a "auto_shutdown => never" supervisor should not refuse to start a child marked as significant), but it might be good to log a warning in such a case. What do you think?
>
> A similar question concerns the combination of "restart => permanent" and "significant => true" in a child spec. A permanent child will always be restarted, it doesn't matter if it is marked as significant or not.
> Similar to what was said above, in meaning the two options don't contradict each other, but in practice you might care.
Hi Maria,

The supervisor behaviour is very central in Erlang/OTP and all other 
behaviours build upon it with the usage of their separate modules. My 
view is that the supervisor needs validation to ensure its process state 
space is as minimal as possible.  Ensuring its state is minimal makes it 
easier to diagnose, test, trust and utilize.

I find it hard to argue for not having validation of arguments that are 
invalid.  You could say "We can't possibly consider every dumb thing a 
developer may do, so let's keep our code as minimal as possible so 
everything looks neat and simple in an effort to avoid bugs.".  However, 
if you choose to not block invalid options, it adds complexity that 
makes the Erlang process more difficult to understand (i.e., more 
difficult to understand "Why is this option here? What was this 
developer thinking?").

So, I would urge you to check for "significant == true" when either 
"auto_shutdown == never" or "restart == permanent", to block invalid use 
and ensure the supervisor process doesn't contain invalid state (even if 
the state doesn't cause functionality problems now, it can always become 
a bug in the future).

A separate example of how the supervisor would benefit from more 
validation is when shutdown > 1000 * (MaxT / MaxR) (i.e., if shutdown 
and MaxR are both a pos_integer()).  That situation could be called the 
"immortal child" because the shutdown value can allow the child process 
to infinitely restart if it always takes longer than 1000 * (MaxT / 
MaxR) milliseconds to terminate.  That situation contains a bit of 
ambiguity because the time the child takes to terminate can vary 
randomly, so it may be a transient problem. However, it is able to make 
the supervisor ineffective for the child process.

The "immortal child" bug may not have been fixed to avoid breaking 
legacy source code, I am not sure (hopefully legacy code doesn't exist 
with that bug).  The "significant == true" checks described above are 
focused on keeping the supervisor process state minimal and relevant, 
even though the option could be silently ignored if validation wasn't 
present.  Having the validation present is beneficial both for the 
development of the supervisor module and developing with it because it 
ensures the design lacks ambiguity and edge-cases (both can lead to 
bugs, either in the supervisor module or in the usage of the supervisor 
module).

Best Regards,
Michael