[erlang-questions] gen_statem and multiple timeout vs 1

Thu Oct 6 11:30:47 CEST 2016

On Wed, Oct 05, 2016 at 04:41:11PM +0000, Vans S wrote:
> The state_timeout that was added makes more sense to me IMO.  And what Fred said about uses for the current timeout being more limited are valid.

It seems we are reaching a common conclusion on that one...

> 
> 
> 
> I think that if there is support for the way the current timeout is, that times out regardless of the state you are currently in, what is the problem with having one vs multiple of them?

I am confused.  The current timeout {timeout,Time,Msg} can not be used
between different states since it is cancelled by the first event that
arrives, so it is cancelled with the first possibility to change states.

So to have more than one of them would mean as they are defined now
that when the first generates a timeout event the others are cancelled
and therefore pointless.  So I see no point in having multiple
{timeout,Time,Msg}s.

Kind of the same applies to {state_timeout,Time,Msg}.  When you change
states it (they) is (are) cancelled, so having multiple timers while
staying in one state gets kind of toothless.  You could do it but it feels
like you will be tempted to force your machine to stay in one state when
it really should have more states coupled to the different timeouts.

The reason {state_timeout,Time,Msg} is interesting to implement is that it
fits a very common niche of timeouts in state machines and is very simple
to use.

> 
> Is it really that much of an annoying feature to add.  Where the cost of maintenance of the code will be greater then the benefit? 
> I even mentioned I would put in the pull request for it if we can agree it would serve some use. 

The problem is not about having multiple timers, it is about having more
general timers since I am convinced that if you have multiple timers you
will want to have control over when to cancel them, and how, which forces a
new more general timer concept.  Hence my suggestion of
{named_timeout,Time,Name}.

For example I think you mentioned that it would be handy to be able to bump
a timer i.e add more time to a running timer, and then you need to know how
much time a timer has left when you cancel it, which is problematic with
the API that we need to use as gen_statem now looks.  You have to return an
action from a state function to control the timer so you can not act on the
return value from erlang:cancel_timer/1,2, not unless you introduce a fun()
in the action, or define different actions for all possible operations.
And just for at bump action the question is what to do if the timer has
already expired?  Time out or restart?  Are there more desired actions than
bump?

So I think the API in gen_statem which we are limited by makes it hard to
build general enough timers when it is comparatively easy to write a small
timer library that does what you need and keep track of the TimerRefs in
your state machine Data, which gives you full control over the timers.

And in general, as Fred Hebert pointed out, the more state data you get in
the state machine engine, the less control the state machine implementation
gets which for instance decreases your freedom to do what you want during
code upgrade where you only may act on State and Data and can know nothing
about the engine state data.

If we introduce {named_timeout,Time,Name} and it turns out to not become
general enough, most will use erlang:start_timer/3,4 anyway and
we will have seldom used code to maintain.  Forcing users to use
erlang:start_timer/3,4 for all cases where {timeout,Time,Msg} and
{state_timeout,Time,Msg} does not suffice might be the better option
since all will have to understand them anyway, which improves readability.

But if you have a suggestion of a good timer interface for multiple timers
in gen_statem at least I would be very interested to hear it.  I do not
think the suggestions that have surfaced so far have been worthy of
implementation...

/ Raimo

> 
>     On Wednesday, October 5, 2016 9:45 AM, Raimo Niskanen <raimo+erlang-questions@REDACTED> wrote:
>  
> 
>  On Tue, Oct 04, 2016 at 10:52:38PM -0500, Fred Hebert wrote:
> > On 10/04, Vans S wrote:
> > :
> > >I did not consider this. This is truly problematic BUT would not the 
> > >current way timeout works run into this same problem? So this should 
> > >not affect allowing multiple timeouts vs a single timeout.
> > 
> > Right now what you can do if you manage your own timers is to cancel the 
> > old ones or ignore them (you've got their refs so it's easy) and you can 
> > set new ones right away. You can't do that however if you rely on the 
> > return tuple of a state transition to do it, since that transition 
> > cannot happen from a code_change callback.
> 
> To some extent this cat is already out of the box thanks to the gen_fsm
> timeout that got inherited to gen_statem.  If such a timer is running and
> there is a code change on the server, it may expire "earlier" than you
> would like or rather suddenly you might have wanted this particular timeout
> to be longer.
> 
> If this is a real problem you can set a flag in your Data to note that
> there has been a code change, and then act differently when the timeout
> comes, or change states to one that is aware of the possibility of an early
> timeout.
> 
> I wonder how this affects my state_timeout proposal...
> It has got the same corner case, and the same workarounds apply.
> 
> The same would apply to named_timeout, with the same workarounds,
> but the more timers you do not control, the messier to work around,
> I guess...
> 
> > 
> > >Learning from that, I rather this be inside gen_statem. If its not, I 
> > >would have no problem to write my own little timer library for 
> > >cancel_timer+purging mailbox. 
> > >
> > 
> > Usually the pattern I use is to write a short 'reset_<whatever>_timer' 
> > function that returns its ref, and then I can track the ref in my state.
> 
> It may be a bit tricky to get it right, but it is educational.
> 
> The problem in writing a library for this is that there are some options
> on both how to start a timer and how to cancel it, so a library would
> either not get it right for everybody or would have to take very many
> options to be right for everybody.
> 
> So this may be a good candidate for a do-it-as-you-want-it library module.
> 
> / Raimo
> 
> 
> 
> > 
> > I always found this fairly convenient to deal with things, and for 
> > example, I mostly don't clean up timers very hard. I.e. you can make use 
> > of the newer cancel_timer options for async returns with or without 
> > info, and by matching on specific refs, I can just ignore timers that 
> > are for events I know are no longer up for consideration and let them 
> > disappear, and automatically can ignore a bunch of potential race 
> > conditions (i.e. no mailbox cleanup, just ignore messages coming in)
> > 
> > This is also neat to avoid whatever blocking or synchronization that 
> > could take place on timer management, but in some cases you may still 
> > want to do that.
> > 
> > 
> > >Since you are only in ONE state at a time, you automatically know if 
> > >the timeout happened, it MUST correspond to the current state we are 
> > >in. 
> > >
> > 
> > Does it? Couldn't the timeout event have been postponed, and therefore 
> > come from a prior state change? I believe so. There's no relationship 
> > between an event being handled in a state and that event having been 
> > sent in that state.
> > 
> > >Another option is to remove the timeout then. It just seems out of place to me.  
> > >
> > 
> > At the two extremes are the positions that all timeout management is 
> > manual, and that gen_statem is to replace full-on control of timeouts 
> > people can do manually (with all the cancellation options and whatnot).
> > 
> > Those are the two extremes, and the current thing is where on the 
> > gradient does the timeout implementation currently lie. So far it 
> > appears to be a fairly minimal convenience factor. I'm just wondering if 
> > it's worth pushing it further on the gradient of "emulating all the 
> > flexibility of manual control."
> > 
> > >What is the use case of a single timeout with a UNIQUE EventContent?
> > >
> > 
> > The current semantics are specific about one thing:
> > 
> >    Generates an event of event_type() timeout after this time (in 
> >    milliseconds) unless another event arrives in which case this 
> >    time-out is cancelled. Notice that a retried or inserted event 
> >    counts like a new in this respect.
> > 
> > These timeouts are mostly used for one thing in my experience: tracking 
> > inactivity of the state machine. gen_servers and gen_fsms have the same 
> > one. The only pragmatic use case I've made of them were to give myself a 
> > delay of N milliseconds before which it would be reasonable to send the 
> > process in hibernation, GCing and compacting its memory at once, since 
> > it had not received a message in that time otherwise.
> > 
> > All other timeouts I tend to manage by hand because I do not want them 
> > to be interruptible by the arrival of other messages in the mailbox of 
> > the process.
> > 
> > This is a very, very important detail! Those semantics are also a lot 
> > trickier to handle by yourself (you'd need to put them in every clause) 
> > than most other timers you could imagine using. This makes them a fairly 
> > good idea to include in the OTP behaviours themselves, as opposed to 
> > most other timer usages, IMO.
> > 
> > Regards,
> > Fred.
> 
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
> 
>    

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB