From carlsson.richard@REDACTED  Mon May 10 15:03:23 2021
From: carlsson.richard@REDACTED (Richard Carlsson)
Date: Mon, 10 May 2021 15:03:23 +0200
Subject: delayed child restart with incremental back-off
In-Reply-To: <87y2d0w9p3.fsf@valhala.localdomain>
References: <87y2d0w9p3.fsf@valhala.localdomain>
Message-ID:

What happened at the time was that I met up with the OTP team and discussed
it, and they eventually agreed that this was a good thing. However, it
needed more work to be accepted (and I realized there were a couple of
weaknesses in the implementation that I needed to address), and I never
found the time to do that work.

        /Richard

On Fri, 30 Apr 2021 at 09:28, Nicolas Martyanoff wrote:

>
> Hi,
>
> Nine years ago, an interesting patch [1] was submitted by Richard Carlsson
> that allows delaying the re-creation of failed children in supervisors.
>
> After a quick discussion, the official answer was that the OTP team
> would discuss it [2]. There is no further message on the mailing
> list.
>
> Was there an official response?
>
> I have various supervisors whose children handle network connections.
> When something goes wrong with the connection, children die and are
> immediately restarted. Most of the time, errors are transient (remote
> server restarting, temporary network issue, etc.), but retrying without
> any delay is pretty much guaranteed to fail again. And of course, after a
> few retries, the application dies, which is unacceptable.
>
> This kind of behaviour is a huge problem: it fills logs with multiple
> copies of identical errors and causes a system failure.
>
> In general, if I could, I would use restart delays with exponential
> backoff everywhere, because in practice restarting immediately is almost
> never the right approach: code errors do not disappear when restarting,
> so they are going to be triggered again immediately, and external errors
> are not magically fixed by retrying without any delay.
>
> Is there still interest in this patch?
>
> [1] https://erlang.org/pipermail/erlang-patches/2012-January/002575.html
> [2] https://erlang.org/pipermail/erlang-patches/2012-January/002597.html
>
> --
> Nicolas Martyanoff
> http://snowsyn.net
> khaelin@REDACTED