Re: Supervisor and configuration update — how to do it?
Mon May 25 00:19:36 CEST 2020
On 5/22/20 7:52 AM, Max Lapshin wrote:
> I have the same repeated pattern in many places:
> I start a process; it has some configuration that can be read and
> changed on the fly without disconnecting sockets or releasing resources.
> This process has some sibling helpers that are launched after it and
> that are connected to it. Usually this is a one-for-all-of-them.
> I am trying to find a common and reasonable pattern here that
> involves editing the supervisor.
> Right now we mostly use an external configuration converger: a process
> that sleeps for several seconds, then wakes up and checks whether the
> whole system is properly configured, i.e. that all required processes
> have been started or killed.
> It works, but it is not as smooth as it can be.
The external configuration converger sounds like a problem to me, because
what you describe is not fail-fast behavior if the converger doesn't
always control the configuration changes (and it sounds like it doesn't,
since you said it was external). The sleep delay before it resolves a
configuration problem makes the failures slow.
If the process that owns the configuration (I call the one in CloudI
the "configurator") makes synchronous requests when changing the
configuration, each response tells you whether the change succeeded.
(Using spawn_link with gen_server:call, or some similar approach, lets
the synchronous requests occur in parallel.) That allows the
configuration process to fail fast. It also lets process restarts be
reserved for unexpected errors (as much as possible, the bugs
developers are unable to anticipate).
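
As a rough sketch of that idea (this is not CloudI's code; the module
name configurator_sketch, the update_config and get_config messages, and
the "valid" key are all invented for illustration), a configuration
owner could push a change to its workers in parallel and collect the
replies like this:

```erlang
-module(configurator_sketch).
-behaviour(gen_server).

-export([start_worker/1, update_all/3]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical worker: its state is just its configuration, which can
%% be replaced without restarting the process (so any sockets or other
%% resources it owns would stay open).
start_worker(Config) ->
    gen_server:start_link(?MODULE, Config, []).

init(Config) ->
    {ok, Config}.

%% An anticipated problem is reported as an error reply, not a crash.
%% The "valid" key is an assumption made up for this example.
handle_call({update_config, #{valid := false}}, _From, Config) ->
    {reply, {error, invalid_config}, Config};
handle_call({update_config, NewConfig}, _From, _OldConfig) ->
    {reply, ok, NewConfig};
handle_call(get_config, _From, Config) ->
    {reply, Config, Config}.

handle_cast(_Msg, Config) ->
    {noreply, Config}.

%% Called by the process that owns the configuration. Each helper makes
%% one synchronous gen_server:call, so the calls run in parallel.
%% Because the helpers are linked, an unexpected failure (a dead worker,
%% a call timeout) also kills the caller unless it traps exits, which
%% leaves restarts to the supervisor for the unanticipated cases only.
update_all(Workers, NewConfig, Timeout) ->
    Parent = self(),
    Tag = make_ref(),
    Helpers = [spawn_link(fun() ->
                   Reply = gen_server:call(W, {update_config, NewConfig},
                                           Timeout),
                   Parent ! {Tag, self(), Reply}
               end) || W <- Workers],
    %% Collect exactly one reply per helper, in worker order.
    Results = [receive {Tag, H, Reply} -> Reply end || H <- Helpers],
    case [R || R <- Results, R =/= ok] of
        []     -> ok;
        Errors -> {error, Errors}
    end.
```

The caller learns immediately whether every worker accepted the change,
for example by calling configurator_sketch:update_all([A, B], NewConfig,
5000) and matching on ok or {error, Errors}, rather than waiting for a
periodic check to notice a mismatch.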