delayed child restart with incremental back-off
Michael Truog
mjtruog@REDACTED
Tue May 4 09:12:06 CEST 2021
On 5/3/21 11:15 PM, Nicolas Martyanoff wrote:
> zxq9 <zxq9@REDACTED> writes:
>
>> You don't have to implement your own supervisor to get this kind of behavior,
>> simply move connection out of initialization. As a general rule initialization
>> should never be dependent on anything outside your node's control --
>> especially not something across the network.
> I do not know why there is such a focus on initialization. Errors can
> occur during the entire lifecycle of a process; it is common to end up
> in a situation where a worker will fail *after* initialization, and this
> failure will repeat due to external consequences or to a coding mistake.
> In that situation, initialization tricks will not help you: the process
> will crash N times in a row, filling the logs with duplicate error
> messages, then the entire program will die. This is not acceptable for a
> server.
>
The reason is that initialization is a short period of time that can be
given a timeout value to limit its execution (and it is the precondition
for all later execution). It is better to have something fail during
initialization than 5 days later. If a failure after x days is difficult
to replicate, you still don't want to wait that length of time to test.
That is why it is best to validate everything during initialization, so
that the open-ended runtime after initialization starts from a state
known to be valid. Otherwise you are just wasting development time when
bugs occur.
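
As a rough illustration of bounding and validating initialization, here
is a minimal sketch of a gen_server whose init/1 checks its configuration
and whose start is limited by the standard {timeout, Milliseconds} start
option. The module name validated_worker, the INIT_TIMEOUT value, the
shape of the config map and the validate/1 helper are all assumptions
made up for this example, not something taken from the thread.

```erlang
-module(validated_worker).
-behaviour(gen_server).

-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical upper bound (in milliseconds) on the time init/1 may take.
-define(INIT_TIMEOUT, 5000).

start_link(Config) ->
    %% The {timeout, Ms} start option limits the time spent in init/1.
    %% If it is exceeded, the new process is terminated and the start
    %% function returns {error, timeout}.
    gen_server:start_link(?MODULE, Config, [{timeout, ?INIT_TIMEOUT}]).

init(Config) ->
    %% Validate everything up front, so a bad configuration fails here
    %% instead of days later.
    case validate(Config) of
        ok ->
            {ok, Config};
        {error, Reason} ->
            {stop, {invalid_config, Reason}}
    end.

%% Hypothetical checks: a host string and a TCP port in the valid range.
validate(#{host := Host, port := Port})
  when is_list(Host), is_integer(Port), Port > 0, Port < 65536 ->
    ok;
validate(Other) ->
    {error, {bad_config, Other}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

With a configuration that passes validate/1, start_link/1 returns
{ok, Pid}. With one that does not, the start fails immediately with an
{invalid_config, ...} reason, which a supervisor would see as a failed
child start rather than as a crash at some later point.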
Best Regards,
Michael