Re: Supervisor and configuration update — how to do it?

Mon May 25 00:19:36 CEST 2020

On 5/22/20 7:52 AM, Max Lapshin wrote:
> I have the same repeated pattern in many places:
>
> I start a process; it has some configuration that can be read and
> changed on the fly without disconnecting sockets and releasing resources.
>
> This process has some sibling helpers that are launched after it and
> that are connected to it. Usually this is a one-for-all-of-them
> strategy.
>
> I think of trying to find some common and reasonable pattern here with
> editing supervisor there.
>
> Right now we mostly use an external configuration converger: a process
> that sleeps for several seconds and then wakes up and starts checking
> whether the whole system is properly configured: all required
> processes are started or killed.
>
> It works, but it is not as smooth as it can be.

The external configuration converger sounds like a problem to me because
your description doesn't sound like fail-fast behavior, if the converger
doesn't always control the configuration changes (it sounds like it
doesn't, because you said it was external).  A bad change is only
noticed the next time the converger wakes up, so the sleep delay before
it resolves a configuration problem makes the failures slow.

If you have a process that owns the configuration (I call the one in
CloudI the "configurator"), it can make synchronous requests when
changing configuration, and each response tells you whether the change
succeeded or not.  (spawn_link with gen_server:call, or some similar
approach, lets the synchronous requests occur in parallel.)  That
allows the configuration process to fail fast.  It also lets process
restarts be reserved for unexpected errors (as much as possible, the
bugs developers are unable to anticipate).
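
To make that concrete, here is a minimal sketch of the idea, not the
CloudI implementation.  The module name, the {reconfigure, Change}
message, and the validation rule (a change must be a map) are all made
up for illustration.  It uses spawn_monitor rather than spawn_link so
the caller collects each result as a 'DOWN' message without trapping
exits; the gen_server:call requests still run in parallel.

```erlang
-module(configurator_sketch).
-behaviour(gen_server).
-export([apply_config/3, demo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Send Change to every target in parallel and wait for every answer.
%% Returns ok if all targets accepted it, else {error, Failures}.
apply_config(Targets, Change, Timeout) ->
    Workers = [{T, spawn_monitor(fun() ->
                   %% A failed call (timeout, noproc) exits this worker
                   %% with its own reason, which the caller also sees.
                   exit({reply, gen_server:call(T, {reconfigure, Change},
                                                Timeout)})
               end)} || T <- Targets],
    Failures = [{T, Result} || {T, {_Pid, Ref}} <- Workers,
                               Result <- [await(Ref)],
                               Result =/= ok],
    case Failures of
        [] -> ok;
        _ -> {error, Failures}
    end.

%% Wait for one worker to finish and unwrap its result.
await(Ref) ->
    receive
        {'DOWN', Ref, process, _Pid, {reply, Reply}} -> Reply;
        {'DOWN', Ref, process, _Pid, Reason} -> {exit, Reason}
    end.

%% Hypothetical target process: its state is the current configuration.
init(Config) -> {ok, Config}.

handle_call({reconfigure, Change}, _From, Config) when is_map(Change) ->
    {reply, ok, maps:merge(Config, Change)};
handle_call({reconfigure, _Bad}, _From, Config) ->
    %% Reject the change and keep the old configuration (no restart).
    {reply, {error, invalid_config}, Config}.

handle_cast(_Msg, Config) -> {noreply, Config}.

%% Two targets: a valid change succeeds, an invalid one fails at once.
demo() ->
    {ok, A} = gen_server:start(?MODULE, #{}, []),
    {ok, B} = gen_server:start(?MODULE, #{}, []),
    ok = apply_config([A, B], #{port => 8080}, 5000),
    {error, [{A, {error, invalid_config}}, {B, {error, invalid_config}}]} =
        apply_config([A, B], not_a_map, 5000),
    ok.
```

The caller knows the outcome as soon as the slowest target answers (or
the timeout expires), instead of waiting for a periodic check, and a
rejected change leaves the target running with its old configuration.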

Best Regards,
Michael