delayed child restart with incremental back-off

Maria Scott maria-12648430@REDACTED
Mon May 3 12:18:19 CEST 2021


Hi

> I have not looked at the patch,

Neither have I =^^=

> but something like this would be good to 
> have. Then we could get rid of supervisor2 in RabbitMQ ( 
> https://github.com/rabbitmq/rabbitmq-server/blob/master/deps/rabbit_common/src/supervisor2.erl#L15 
> for the delay part, non-backoff in our case ).

I have only read the comment (4) explaining the delay behavior in supervisor2, and I guess it does things a bit differently from what the OP seems to ask for. Specifically, it says that when a child exceeds the restart limit, another restart attempt will be delayed instead of the supervisor shutting down. What the OP asks for, if I understand correctly, is delays between restart attempts in general (right?).
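To make the distinction concrete, here is a rough model of the two policies as I understand them. It is plain Python rather than Erlang, it is not the supervisor or supervisor2 API, and every name and number in it is made up for illustration. The first function only reflects my reading of that comment, not the actual supervisor2 code; the second uses a doubling delay with a cap, which is just one possible form of back-off.

```python
def delay_after_limit(attempt, max_restarts, delay_ms):
    """Policy A (my reading of the supervisor2 comment): restart
    immediately until the restart limit is exceeded, then delay
    instead of giving up. `attempt` counts restarts from 1."""
    return delay_ms if attempt > max_restarts else 0


def delay_with_backoff(attempt, base_ms, cap_ms):
    """Policy B (what the OP seems to ask for): wait before every
    restart, doubling the wait each time, up to a cap."""
    return min(cap_ms, base_ms * 2 ** (attempt - 1))


# Hypothetical settings: limit of 3 restarts with a 1000 ms delay for A;
# 100 ms base and 2000 ms cap for B.
attempts = range(1, 7)
limit_only = [delay_after_limit(a, 3, 1000) for a in attempts]
backoff = [delay_with_backoff(a, 100, 2000) for a in attempts]

print(limit_only)
print(backoff)
```

With policy A the first restarts are still immediate; with policy B even the very first restart waits.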

> I was going to see if Maria/Jan had interest in providing a patch for 
> this as well, so I'm glad that there's others showing interest.

Hm, not sure (yet). Since we're talking about the supervisor, another EEP will be required. This seems to be a somewhat controversial topic with a long history, and I think there are valid arguments for as well as against delays. As it is too late for OTP 24 now anyway (and I have no immediate use case for it myself), I would let the discussion run on for a while and see where it leads before attempting anything ;)

> > In general, if I could, I would use restart delays with exponential
> > backoff everywhere because in practice, restarting immediately is almost
> > never the right approach: code errors do not disappear when restarting

They won't disappear after a delay, either. Just saying ;)

Kind regards,
Maria

