EEP proposal - Delayed restarts of supervisor children

Maria Scott maria-12648430@REDACTED
Thu Jun 17 11:22:09 CEST 2021


Hi,

thanks for your comments :)

> If 'undefined' and 0 are equivalent, let's just drop 'undefined' and 
> make 0 the default. No need to complicate things further.

We have been discussing this back and forth for a while...
On the one hand, in practice 0 is the de facto default. Temporary children (which are never restarted) can simply ignore it, whatever it is set to.
On the other hand, the previous EEP came with a strong requirement to reject option combinations that make no sense (like permanent+significant). The same requirement will most likely come up here, too. With that in mind, _any_ integer value makes no sense for temporary children, not even 0.
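
To make the second argument a bit more concrete, here is a minimal sketch of such a check. It is written in Python purely for illustration and is not supervisor code. The key name `restart_delay` and the function `check_child_spec` are hypothetical, and `None` stands in for 'undefined'. The permanent+significant rule is the one mentioned above; the temporary rule is the strict reading I am describing, not something that has been decided.

```python
def check_child_spec(spec):
    """Return 'ok', or raise ValueError for a combination that makes no sense."""
    restart = spec.get("restart", "permanent")
    significant = spec.get("significant", False)
    # Hypothetical key; None plays the role of 'undefined'.
    delay = spec.get("restart_delay")

    # Combination rejected by the previous EEP.
    if restart == "permanent" and significant:
        raise ValueError("permanent + significant makes no sense")

    # Strict reading: a temporary child is never restarted, so any
    # integer delay is meaningless for it, including 0.
    if restart == "temporary" and delay is not None:
        raise ValueError("temporary + restart_delay makes no sense")

    # For all other children the delay must be a non-negative integer
    # (bool is excluded explicitly because it is a subclass of int).
    if delay is not None:
        if isinstance(delay, bool) or not isinstance(delay, int) or delay < 0:
            raise ValueError("restart_delay must be a non-negative integer")

    return "ok"
```

With 'undefined' kept, a temporary child simply leaves the key out and passes the check. If 0 became the only default, the check for temporary children would have to either accept 0 as a special case or be dropped.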

> Since it is already possible (though rare) to start or restart a child 
> that is currently restarting and has failed to restart at least once, I 
> suppose the same behavior should apply when there are delays. At least 
> if the current behavior is well tested, is it? If not then we can just 
> pick the most sensible solution.

Ok, that part about force-restarting can be misleading; I should rewrite it.
The truth is, it is currently _not_ possible to restart a child that is already restarting, whether a restart attempt has failed or not. With undelayed restarts, doing so would make no sense and no difference. With delayed restarts, however, it may become desirable, at least with the one_for_one strategy (force-restarting single children under one_for_all or rest_for_one is probably pointless and/or dangerous, but we cannot rule it out outright).

What _is_ already possible is that a new child gets started between restart retries, but I'm pretty certain that this is not intentional (and no, it is completely untested, and I wouldn't even know how to test it if I wanted to).
It is likely a problem that has never hit anybody, as the likelihood of it occurring is extremely small. For one, it is not a problem at all with the one_for_one strategy, and adding children dynamically under one_for_all or rest_for_one is unusual; such supervisors tend to be rather static. Even where it is done, the restart must fail at least once, and the request to start a new child must arrive in the tiny window between the initial crash or a failed restart of a child and the next restart attempt. It should be extremely rare for all of those conditions to coincide, rarer still for that to cause a problem, and even then the problem would most likely be resolved by another restart and so go unnoticed.

What is different with delayed restarts is that the aforementioned window is wider: it exists for the entire duration of the delay. It also exists between the initial crash and the first restart attempt, but that is probably a minor detail.

> The copyright line must be updated, see the updated template.

This: "This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive."?
I admit that I didn't read carefully and interpreted it as "pick one" ^^;

Kind regards,
Maria

