EEP proposal - Delayed restarts of supervisor children

Fred Hebert mononcqc@REDACTED
Fri Jun 25 14:47:27 CEST 2021


I haven't had time to think this through thoroughly; the client of mine
that Adam linked to was something I threw together to try gen_statem when
it was being proposed and we were discussing using handle_event vs. named
states for complex events. I don't think it should apply universally to all
clients as structured, although the state transitions can be interesting.

The core thing to worry about for behaviours has to be related to what's
generic and shared by all implementations, and what's specific and must be
dealt with individually.
Connection-oriented clients tend to share similar concerns but can get real
funky fast.

Something like heartbeats and whatnot can be seen as basic functionality
to detect whether something should time out (and therefore if you don't
want to deal with heartbeats on the generic side you can offer configurable
timeouts that can be ignored, but the generic side won't be able to know
what should reset a timeout or not), and some protocols have to do multiple
layers of synchronization.

Think of a stateful handshake going through TLS, then a gRPC back-and-forth
between client and server, then an authentication sequence, before
providing a token that then requires being carried across all messages. How
is that represented? Can they time out independently such that the auth
session can time out before the connection? This could happen with a stable
network where connections don't break but SAML auth lasts <24h. Maybe
crashing is okay then. You wouldn't want to have 3 connection processes for
each actual app-level connection, that'd be costly.
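
One way to keep a single process while letting the layers expire
independently is gen_statem's named generic timeouts. The following is only
a minimal sketch under assumptions: the module name, the state names, the
{data, _} message and the two timer names (idle, auth) are all made up for
illustration, and a real client would re-authenticate rather than just
change state.

```erlang
-module(layered_timeouts_sketch).
-behaviour(gen_statem).

-export([start_link/2]).
-export([init/1, callback_mode/0, handle_event/4]).

%% IdleMs: connection idle timeout, AuthMs: auth session lifetime.
%% Both names are hypothetical.
start_link(IdleMs, AuthMs) ->
    gen_statem:start_link(?MODULE, {IdleMs, AuthMs}, []).

callback_mode() -> handle_event_function.

init({IdleMs, AuthMs}) ->
    %% Two independent named timers owned by one process.
    {ok, authenticated, #{idle => IdleMs},
     [{{timeout, idle}, IdleMs, expired},
      {{timeout, auth}, AuthMs, expired}]}.

%% Traffic restarts only the idle timer; the auth timer keeps running.
handle_event(info, {data, _Payload}, _State, #{idle := IdleMs}) ->
    {keep_state_and_data, [{{timeout, idle}, IdleMs, expired}]};
%% The auth session expired while the connection is still healthy.
handle_event({timeout, auth}, expired, authenticated, Data) ->
    {next_state, unauthenticated, Data};
%% Nothing was heard for too long: give up on the connection itself.
handle_event({timeout, idle}, expired, _State, _Data) ->
    {stop, {shutdown, idle_timeout}};
handle_event({call, From}, get_state, State, _Data) ->
    {keep_state_and_data, [{reply, From, State}]};
handle_event(_Type, _Content, _State, _Data) ->
    keep_state_and_data.
```

Generic timeouts are not cancelled by a state change, which is what lets
the idle timer survive the auth expiry here.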

The OSI layered approach would be a fine model if it were not for the pesky
real world never obeying it in the face of that complexity! And what of that
annoying thing where some sessions or connections can be set up
synchronously or asynchronously? That's gonna suck. Maybe both can be
handled, or it's expected that the person writing the client just deals
with all this garbage in a single callback:

    connect(Opts, State) -> {ok, Conn, State [, Timeout]}
                          | {error, Reason, State}
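
To make the shape of that connect/2 callback concrete, here is a minimal
sketch of a callback module implementing it over plain gen_tcp. The module
name, the assumption that Opts is a map, the option keys (host, port,
connect_timeout) and the 30 second timeout in the return value are all
hypothetical; nothing here is a settled interface.

```erlang
-module(tcp_client_sketch).

-export([connect/2]).

%% Hypothetical callback module for the connect/2 shape discussed above.
-spec connect(map(), term()) ->
    {ok, term(), term(), timeout()} | {error, term(), term()}.
connect(Opts, State) ->
    Host = maps:get(host, Opts, "localhost"),
    Port = maps:get(port, Opts),
    ConnectTimeout = maps:get(connect_timeout, Opts, 5000),
    case gen_tcp:connect(Host, Port, [binary, {active, false}], ConnectTimeout) of
        {ok, Socket} ->
            %% Any further layers (TLS, auth, ...) would have to be set up
            %% here, synchronously, before handing the connection back.
            {ok, Socket, State, 30000};
        {error, Reason} ->
            %% The generic side decides whether and when to retry.
            {error, Reason, State}
    end.
```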

Specifically, if what we want to deal with is abstracting away retries,
there could be init options that let you specify a retry type that could be
exponential, based on a circuit breaker, or just a callback with state
(which lets the user specify how to deal with things). The latter could let
you have a retry where a common interface can be used to carry state and
deal with various implementations such that you can transparently switch
between a blessed implementation of exponential backoffs with jitter, a
trickle circuit breaker, or a shared one. The management of such a circuit
breaker could be left external and so on.
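
As a rough sketch of the "callback with state" idea: a retry strategy can
be a fun paired with an opaque state, so that an exponential backoff with
jitter and a simpler policy are interchangeable behind one next/1 call. The
module name, the function names and the {retry, Delay, State} / {stop,
Reason} return shapes are assumptions made up for this example.

```erlang
-module(retry_sketch).

-export([next/1, exponential/2, max_attempts/1]).
-export_type([strategy/0]).

%% A strategy is a fun plus the opaque state it carries between attempts.
-type strategy() ::
    {fun((term()) -> {retry, non_neg_integer(), term()} | {stop, term()}),
     term()}.

%% Ask the strategy what to do after a failed attempt.
-spec next(strategy()) ->
    {retry, non_neg_integer(), strategy()} | {stop, term()}.
next({Fun, State}) ->
    case Fun(State) of
        {retry, DelayMs, NewState} -> {retry, DelayMs, {Fun, NewState}};
        {stop, Reason} -> {stop, Reason}
    end.

%% Exponential backoff with full jitter: a random delay between 0 and
%% min(MaxMs, BaseMs * 2^Attempt). The exponent is capped at 20 so the
%% shifted value stays small.
-spec exponential(pos_integer(), pos_integer()) -> strategy().
exponential(BaseMs, MaxMs) ->
    {fun(Attempt) ->
             Cap = min(MaxMs, BaseMs bsl min(Attempt, 20)),
             {retry, rand:uniform(Cap + 1) - 1, Attempt + 1}
     end, 0}.

%% A trivial alternative policy: retry immediately, at most N times.
-spec max_attempts(non_neg_integer()) -> strategy().
max_attempts(N) ->
    {fun(0) -> {stop, max_attempts};
        (Left) when Left > 0 -> {retry, 0, Left - 1}
     end, N}.
```

A circuit breaker managed externally could plug in the same way, with the
fun asking the breaker whether another attempt is allowed.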

But the scope of the module needs to be limited enough that you don't start
having to absorb the functionality of other things (multiplexed sessions,
understanding of transactions).

Anyway, I haven't yet thought this through, but it feels possible to get
something that would mostly work there. I'm expecting to be fully surprised
by the unexpected, so it'd be fun to think of things like connection-less
clients (say DNS) or hybrid modes (QUIC) and how they could be fit to this
pattern.

On Wed, Jun 23, 2021 at 5:13 AM Adam Lindberg <hello@REDACTED> wrote:

> There was a similar example made by Fred a few years ago:
> https://gist.github.com/ferd/c86f6b407cf220812f9d893a659da3b8
>
> Cheers,
> Adam
>
> > On 22 Jun 2021, at 20:09, Maria Scott <maria-12648430@REDACTED> wrote:
> >
> > I was just thinking... Could we do something like this as a
> > rebar/erlang.mk template maybe? Like, a gen_statem with batteries
> > included? That would leave the maximum degree of freedom to users while
> > providing a good starting point from which to customize, and without
> > putting any technical debt on us or anybody else in OTP.
> >
> > Kind regards,
> > Maria
> >
> > -------- Original message --------
> > From: Maria Scott <maria-12648430@REDACTED>
> > Date: 22.06.21 16:20 (GMT+01:00)
> > To: Viktor Söderqvist <viktor@REDACTED>, eeps@REDACTED
> > Subject: Re: EEP proposal - Delayed restarts of supervisor children
> >
> > Hi Viktor,
> >
> > Hm, the idea sounds interesting. A bit like a specialized gen_statem,
> > at first thought.
> >
> > But I guess it won't be easy to find something that is special enough
> > to warrant not using a gen_statem itself, but general enough to be able
> > to cover most of the common use cases with it.
> > Depending on the characteristics of the external service, the
> > requirements for the client, and the behavior of the connection between
> > the two, many possibilities exist which the hypothetical gen_client
> > should be able to account for.
> >
> > Let's hear what Fred thinks ;)
> >
> > > A different note regarding automatic reconnects in clients: They may be
> > > problematic, since there may be some state associated with the
> > > connection (such as an ongoing database transaction) which is lost if
> > > automatic reconnect is done without care. Crashing instead of
> > > reconnecting makes this handling way simpler (or at least it moves the
> > > problem to somewhere else). How would you best solve this using the
> > > hypothetical gen_client behaviour?
> >
> > Automatic reconnecting is not a problem in itself if you ask me. It is
> > a problem if it happens _transparently_, i.e. if processes using the
> > client have no way of noticing it. I think it should even be made
> > _impossible_ to use a reconnected client without the user process being
> > informed and performing some extra steps in order to use it again.
> > What I'm imagining (without having given it too much thought) is to let
> > the client manage a token (a reference maybe) which users can ask for
> > and have to provide together with requests. On reconnect, the client
> > changes that token, thus invalidating all requests made with the old
> > one. Like this:
> >
> > * client C is connected, its current token is T1
> > * user U wants to use C and asks it for its token, receives T1
> > * U sends {T1, Request1} to C; C accepts as T1 matches its own token
> > * C's connection fails, it changes the token to T2 and reconnects
> > * U, unaware of C having reconnected, sends {T1, Request2} to C; C
> >   rejects because T1 does not match its own token
> > * thus, U knows that C has reconnected and that any connection-related
> >   state is lost; if U decides to continue using C, it must ask for the
> >   current token, and receives T2
> > * U sends {T2, Request2} to C; C accepts as T2 matches its own token
> > * etc
> >
> > Kind regards,
> > Maria
> > _______________________________________________
> > eeps mailing list
> > eeps@REDACTED
> > http://erlang.org/mailman/listinfo/eeps