[erlang-questions] gen_statem - state_timeout sometimes reaching callback module as [info, {timeout, Ref, Name}

Raimo Niskanen raimo+erlang-questions@REDACTED
Wed Apr 17 11:54:42 CEST 2019


After some discussions with the VM guys this seems to be a bug in
gen_statem coming from a misunderstanding of which guarantees that
asynchronous timer cancels gives and does not give.

The {timeout, Ref, Msg} message may actually come _after_ the
{cancel_timer, Ref, false} message.  Heavy load seems to increase the
probability from almost zero to very small.

So the handling of these messages is broken for gen_statem on OTP-21, but
this should be fixed on master (soon to be OTP-22) since gen_statem there
does not use asynchronous timer cancels.

I will look into how big a diff it will be to rewrite gen_statem in OTP-21.latest
to use synchronous timer cancel...

Thank you for finding a tricky bug!
/ Raimo



On Fri, Apr 12, 2019 at 04:13:27PM +0100, Peter Morgan wrote:
> Hi Raimo,
> 
> Thanks for looking at this, and the background on the implementation which is very useful. I am in the process of putting together a simple test case, but having trouble reproducing it. It appears to happen very infrequently - possibly something that is load related? The original output was from the Crash Report which I’ve attached - where you can see the info timeout was received by our gen_statem behaviour callback. Our OTP is 21.2 (without any 21.2.X patches):
> 
> {stdlib,"ERTS  CXC 138 10","3.7”}
> {kernel,"ERTS  CXC 138 10","6.2”}
> 
> 
> Your question - “why so many?” - it could be a failure of a downstream service that is ultimately causing this, and we’re not backing off in this case - something that I’m checking too. However, I still thought it was interesting that the info timeout reaches the behaviour module.
> 

> =CRASH REPORT==== 8-Apr-2019::23:37:19.038905 === <0.969.0> proc_lib:crash_report/4:508
>   crasher:
>     initial call: kafire_fetcher:init/1
>     pid: <0.969.0>
>     registered_name: []
>     exception error: no function clause matching 
>                      kafire_fetcher:handle_event(info,
>                                                  {timeout,
>                                                   #Ref<0.2399112782.889192450.118024>,
>                                                   fetch},
>                                                  subsequent_fetches,
>                                                  our_state) (src/kafire_fetcher.erl, line 445)
>       in function  gen_statem:call_state_function/5 (gen_statem.erl, line 1662)
>       in call from gen_statem:loop_event_state_function/6 (gen_statem.erl, line 1023)
>     ancestors: [<0.931.0>,<0.332.0>,kafire_cluster_sup,kafire_sup,<0.227.0>]
>     message_queue_len: 279
>     messages: [{timeout,#Ref<0.2399112782.903610369.70160>,fetch},
>                   {cancel_timer,#Ref<0.2399112782.903610369.70160>,false},
>                   {cancel_timer,#Ref<0.2399112782.903610369.70168>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118038>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118039>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118043>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118044>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118045>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118049>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118050>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118051>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118052>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118053>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118054>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118055>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118056>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118060>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118061>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118062>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118063>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118064>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118068>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118069>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118070>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118071>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118074>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118079>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118081>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118083>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118087>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118088>,0},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118089>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118093>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118094>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118095>,3},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118096>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118097>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118098>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118099>,3},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118103>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118104>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118105>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118106>,3},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118107>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118108>,1},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118109>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118113>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118114>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118115>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118116>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118117>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118118>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118119>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118120>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118121>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118122>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118123>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118124>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118125>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118126>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118127>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118128>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118129>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118130>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118131>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118132>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118133>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118134>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118135>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118136>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118137>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118138>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118139>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118140>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118141>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118145>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118146>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118147>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118148>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118149>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118150>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118151>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118152>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118153>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118154>,5},
>                   {timeout,#Ref<0.2399112782.889192450.118159>,fetch},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118159>,false},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118165>,0},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118169>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118170>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118173>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118174>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118175>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118176>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118177>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118178>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118179>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118180>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118181>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118182>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118183>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118184>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118185>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118189>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118190>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118191>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118192>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118196>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118197>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118198>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118199>,5},
>                   {timeout,#Ref<0.2399112782.889192450.118203>,fetch},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118203>,false},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118206>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118207>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118208>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118212>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118213>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118214>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118215>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118221>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118225>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118226>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118227>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118231>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118235>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118236>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118237>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118238>,2},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118242>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118243>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118244>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118248>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118249>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118250>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118254>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118255>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118256>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118260>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118264>,0},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118265>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118266>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118270>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118271>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118272>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118273>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118274>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118278>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118279>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118280>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118281>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118282>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118283>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118287>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118288>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118289>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118290>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118291>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118295>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118296>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118297>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118298>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118299>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118303>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118304>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118305>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118306>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118307>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118308>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118309>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118313>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118314>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118315>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118316>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118317>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118318>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118319>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118323>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118324>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118325>,5},
>                   {timeout,#Ref<0.2399112782.889192450.118326>,fetch},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118326>,false},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118327>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118328>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118329>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118330>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118331>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118332>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118333>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118337>,3},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118338>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118339>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118340>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118344>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118345>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118346>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118347>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118351>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118352>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118353>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118357>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118358>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118359>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118360>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118364>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118365>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118366>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118367>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118371>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118372>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118373>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118377>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118378>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118379>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118383>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118384>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118385>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118389>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118390>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118391>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118392>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118393>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118394>,0},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118395>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118399>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118400>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118401>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118402>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118403>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118404>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118405>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118409>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118410>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118411>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118412>,1},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118413>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118414>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118415>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118419>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118420>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118421>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118422>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118423>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118424>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118425>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118429>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118430>,0},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118431>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118432>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118433>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118434>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118435>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118439>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118440>,1},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118441>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118442>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118443>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118444>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118445>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118449>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118450>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118451>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118452>,4},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118453>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118454>,5},
>                   {cancel_timer,#Ref<0.2399112782.889192450.118455>,5},
>                   {timeout,#Ref<0.2399112782.889192450.118456>,fetch},
>                   {'EXIT',<0.10080.8>,normal},
>                   {'EXIT',<0.10079.8>,normal},
>                   {'EXIT',<0.10081.8>,normal},
>                   {'EXIT',<0.10082.8>,normal},
>                   {'EXIT',<0.10083.8>,normal},
>                   {'EXIT',<0.10084.8>,normal},
>                   {'EXIT',<0.10085.8>,normal},
>                   {'EXIT',<0.10086.8>,normal},
>                   {'EXIT',<0.10087.8>,normal},
>                   {'EXIT',<0.10088.8>,normal},
>                   {'EXIT',<0.10089.8>,normal},
>                   {'EXIT',<0.10071.8>,normal}]
>     links: [<0.931.0>]
>     dictionary: []
>     trap_exit: true
>     status: running
>     heap_size: 17731
>     stack_size: 27
>     reductions: 28849413204
>   neighbours:

> 
> 
> I’ll try and get a simple test case for you.
> Regards,
> Peter.
> 
> 
> 
> > On 12 Apr 2019, at 13:38, Raimo Niskanen <raimo+erlang-questions@REDACTED> wrote:
> > 
> > On Fri, Apr 12, 2019 at 10:38:42AM +0200, Raimo Niskanen wrote:
> >> On Wed, Apr 10, 2019 at 01:41:06PM +0100, Peter Morgan wrote:
> >>> Hello -
> >>> 
> >>> We are _sometimes_ seeing cases where a state_timeout in a gen_statem results in the timeout reaching the callback module as info {timeout, Ref, Name} with OTP 21.2.
> >>> 
> >>> The crash looks like:
> >>> 
> >>> =ERROR REPORT==== 8-Apr-2019::23:37:19.035346 === <0.969.0> gen_statem:error_info/5:1895
> >>> ** State machine {kafire_fetcher,<<“abc">>,<<“def">>,0} terminating
> >>> ** Last event = {info,{timeout,#Ref<0.2399112782.889192450.118024>,fetch}}
> >>> 
> >>> We crash because we are not expecting info messages - interestingly the following messages are in the queue for the crashed process:
> >>> 
> >>> 
> >>>    message_queue_len: 279
> >>>    messages: [{timeout,#Ref<0.2399112782.903610369.70160>,fetch},
> >>>                  {cancel_timer,#Ref<0.2399112782.903610369.70160>,false},
> >>>                  {cancel_timer,#Ref<0.2399112782.903610369.70168>,4},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118038>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118039>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118043>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118044>,4},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118045>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118049>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118050>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118051>,4},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118052>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118053>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118054>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118055>,4},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118056>,5},
> >>>                  {cancel_timer,#Ref<0.2399112782.889192450.118060>,5},
> >> 
> >> Is that from a process_info() call?
> >> 
> >>> 
> >>> Followed by lots more cancel_timer messages (200ish!). Our gen_statem does use {state_timeout, 500, fetch} so we are expecting a “fetch” timeout to happen, and we use repeat_state_and_data to requeue the state timeout (and also transition to other states). Is it possible that timeouts in gen_statem can be delivered as an info message?
> >> 
> >> It is not supposed to happen.  This must be a bug.
> >> 
> >> The gen_statem engine keeps track of the running state_timeout timer
> >> and should never present it as an info event.
> >> 
> >> When a timer is stopped (or restarted) it is done with an asynchronous
> >> cancel (in OTP 21) so the cancel_timer messages comes from that.
> >> They should be matched against map entries in the internal gen_statem
> >> engine state when they arrive so them being in the inbox may be ok.
> >> 
> >> However, how come you have so many?  Are you restarting the state_timeout
> >> in a very tight loop of repeat_state:s?  Can you show the essential parts
> >> of the code that causes state_timeout:s?
> >> 
> >> Also - OTP-21.2; is that your exact OTP version?  What is your stdlib version?
> >> 
> >> I will look at the relevant parts of the code with the new knowledge that
> >> a state_timeout timer can be lost, probably in combination with
> >> repeat_state and state_timeout restart!
> > 
> > 
> > Sorry, the knowledge did not help; I did not find anything suspicous.
> > 
> > More information will be needed...
> > 
> >> 
> >> On master since Jan 21 so in OTP-22.0-rc1 among other changes the timer handling
> >> has been rewritten to use a synchronous cancel, which simplified the code
> >> significantly.  I do not know if that would be worth trying?
> >> 
> >> / Raimo Niskanen
> >> 
> >> 
> >>> 
> >>> Thanks
> >>> Peter.
> >>> 
> > 
> > -- 
> > 
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions
> 


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



More information about the erlang-questions mailing list