[erlang-questions] Dependencies, included applications and supervision

Thu Jun 26 18:03:34 CEST 2014

On Jun 25, 2014, at 5:10 PM, Fred Hebert <mononcqc@REDACTED> wrote:

> What I see for these is therefore:
> 
> 1. A presence management and registration app, let's call it presencerl;
> 2. A lightweight self-registration process that contacts presencerl to
>   register.
> 
>      (MyApp)                        (Presencerl)
>     [MyApp_sup]                  [presencerl_sup]---ETS
>     /    |   \                      /         \
>  [w1]  [w1]  [agent]   [presencerl_local] [presencerl_remote]
> 
> Where MyApp_sup can be using a rest_for_one strategy. The 'agent'
> process is in charge of calling something like 'presencerl:register(App,
> self())', and then it can just hibernate forever.
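
A minimal sketch of the agent described in the quote above, not anyone's actual code. The module name myapp_agent is made up, and presencerl:register/2 is the hypothetical call from the quote; the registration fun is passed in so the agent can be tried without presencerl present.

```erlang
-module(myapp_agent).
-behaviour(gen_server).

-export([start_link/0, start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Default: register through the hypothetical presencerl:register/2.
start_link() ->
    start_link(fun presencerl:register/2).

%% Register is any fun(App, Pid); it is assumed to return ok on success.
start_link(Register) when is_function(Register, 2) ->
    gen_server:start_link(?MODULE, Register, []).

init(Register) ->
    %% Crash here if registration fails, so the supervisor notices.
    ok = Register(myapp, self()),
    %% Nothing else to do: go to sleep until a message arrives.
    {ok, no_state, hibernate}.

%% Every callback goes straight back to hibernation.
handle_call(_Req, _From, State) -> {reply, ok, State, hibernate}.
handle_cast(_Msg, State)        -> {noreply, State, hibernate}.
handle_info(_Info, State)       -> {noreply, State, hibernate}.

terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Placed last under a rest_for_one supervisor, the agent is restarted (and so re-registers) whenever one of the workers before it is restarted.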

We may try this just for grins. It is closer to the service style of things
that I think should be in OTP’s future. The key is using monitors
properly. I’ve been leaning toward rewriting our supervisor hierarchies
to be shallower without children on init, then triggering the population
of the hierarchy later with an FSM-style controller that is the marionette-
master.
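
A rough sketch of the shallow-supervisor idea, assuming a supervisor that starts with no children and a populate/1 function that a separate controller process calls later. The names myapp_shallow_sup and populate/1 are illustrative only, and the controller itself (the FSM part) is left out.

```erlang
-module(myapp_shallow_sup).
-behaviour(supervisor).

-export([start_link/0, populate/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% No children at init: the supervisor comes up immediately and empty.
init([]) ->
    {ok, {{one_for_one, 5, 10}, []}}.

%% Called later by the controller, with child specs in the order it
%% wants them started. Returns the pids in the same order.
populate(ChildSpecs) ->
    [start_one(Spec) || Spec <- ChildSpecs].

start_one(Spec) ->
    case supervisor:start_child(?MODULE, Spec) of
        {ok, Pid}                       -> Pid;
        {ok, Pid, _Info}                -> Pid;
        {error, {already_started, Pid}} -> Pid
    end.
```

The controller decides when and in what order populate/1 runs, which is where the startup sequencing logic would live instead of in nested init/1 callbacks.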

> Using it that way, you decouple your service discovery from your
> principal app, and can build multiple apps that all depend on presencerl
> independently. You can even have 'sub-services' if you want, and you
> just did away with your included applications.
> 
> You can also disable specific services during partial upgrades without
> needing to shut down other services running on the same node!

It’s really just a tradeoff. Your goal was to eliminate included applications,
mine was to have them so I have more control of the app. Just a preference
(with consequences).
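
In the application resource file the two preferences differ by one key. A hypothetical myapp.app.src for the included-application style might look like the sketch below; the description, version and myapp_app callback module are placeholders.

```erlang
{application, myapp,
 [{description, "Example service (placeholder)"},
  {vsn, "0.1.0"},
  {registered, []},
  %% Dependencies that must already be running before myapp starts.
  {applications, [kernel, stdlib]},
  %% Included: loaded with myapp but not started on its own; myapp's
  %% supervision tree has to start presencerl's top supervisor itself.
  %% For the decoupled style, move presencerl into 'applications'.
  {included_applications, [presencerl]},
  {mod, {myapp_app, []}}]}.
```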

I can understand fear of the complexity that arises with deep hierarchies and
startup sequences, but I also see code where 8 or 10 applications
are started with no thought given to how they interact. There is just a big
function that starts them all, or the new ensure_all_started hack, and
no thought given to how things shut down, because everyone who touches
the code adds “just one more” app. I would rather force new additions
to come with some thought about why and when they are running, how
failure occurs, and what its implications are.

> 
>> 
>> The “lesser alternatives” I referred to resulted in situations where the
>> presence subsystem went away but the service was available and running
>> just fine, but not receiving traffic because it appeared to be offline.
>> 
> 
> This doesn't really go away in this case, unless you make the presencerl
> app permanent, which can then have it share similar failure thresholds
> to your main app, something that is somewhat equivalent to shoving both
> under the same supervisor.

For presence I would definitely make it a permanent app. We have experienced
real situations where the node is functioning normally and not advertising its
availability. That is a problem for our service. And that was the fear of OP’s
teammates originally. One we’ve experienced.
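
For anyone following along, a small sketch of starting an app as permanent by hand. The myapp_boot module and start_permanent/1 are invented names; only application:start/2 is the real OTP call. When a permanent application terminates, the other applications and the runtime system are terminated too, so the node goes down rather than running unadvertised.

```erlang
-module(myapp_boot).
-export([start_permanent/1]).

%% Start App with restart type 'permanent'.
%% Note: if App is already running, its existing restart type is NOT
%% changed; this only reports that there was nothing to start.
start_permanent(App) ->
    case application:start(App, permanent) of
        ok                              -> ok;
        {error, {already_started, App}} -> ok
    end.
```

Usage would be something like myapp_boot:start_permanent(presencerl) before starting the main app.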

jay