how do I do the equivalent of ets:tab2list(timer_tab) for BIF timers

Matthias Lang matthias@REDACTED
Mon Dec 6 23:58:08 CET 2010


In short: A bug in the Linux kernel on Au1000 MIPS CPUs sometimes
       causes Erlang's timers and timeouts to go haywire.
       Most likely, nobody but me is affected. Erlang is not the problem.

Longer version follows as a reply to my own post. It's only of
interest to the curious.

On Tuesday, November 23, Matthias Lang wrote:
> How can I see a list of timers started by erlang:send_after/3 and
> 'receive ... after' constructs?
> The timer module "lets" me peek at timers fairly easily, but that only
> works for timers started through the timer module, i.e.:

Answer: "recompile with debug enabled and call p_slpq()".

Michal Zajda suggested looking at the "nbif_timer" parameter in the
binary returned by erlang:system_info(info). That doesn't do what I
want: the "bif_timer" parameter only tells me how much memory is
allocated for timers, nothing more.

Erlang "bif timers", i.e. 'receive...after' timers and 'erlang:send_after'
timers, are stored in a structure controlled by erts/emulator/beam/time.c.
The debug-only function 'p_slpq()' does roughly what I wanted.

The actual problem I was chasing was caused by calls to

      clock_gettime(CLOCK_MONOTONIC, &ts)

on rare occasions jumping several days into the future, only to jump
back again. This makes Erlang's timers go haywire, e.g. a 5s timeout
suddenly becomes a multi-day timeout, which makes any Erlang process
waiting for a timeout seem frozen.
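
A jump like this can be caught from user space by polling the clock and
comparing consecutive readings. The following is only a minimal sketch,
not the test actually used to find the bug: the helper names
(ts_to_ns, classify_step, watch_monotonic) and the max_step_ns
threshold are made up for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() under strict C modes */
#endif
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: flatten a timespec into nanoseconds. */
int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Classify the step between two consecutive readings:
 *  -1  the clock went backwards
 *   1  the clock moved forward by more than max_step_ns
 *   0  the step looks plausible */
int classify_step(int64_t prev_ns, int64_t now_ns, int64_t max_step_ns)
{
    if (now_ns < prev_ns)
        return -1;
    if (now_ns - prev_ns > max_step_ns)
        return 1;
    return 0;
}

/* Read CLOCK_MONOTONIC `samples` times back to back and count suspicious
 * steps. Returns the number of anomalies, or -1 if clock_gettime() fails. */
int watch_monotonic(long samples, int64_t max_step_ns)
{
    struct timespec ts;
    int64_t prev, now;
    int anomalies = 0;
    long i;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    prev = ts_to_ns(&ts);

    for (i = 0; i < samples; i++) {
        if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
            return -1;
        now = ts_to_ns(&ts);
        if (classify_step(prev, now, max_step_ns) != 0) {
            anomalies++;
            fprintf(stderr, "suspicious step: %lld ns -> %lld ns\n",
                    (long long)prev, (long long)now);
        }
        prev = now;
    }
    return anomalies;
}
```

Calling watch_monotonic(1000000, 3600LL * 1000000000LL), for example,
would flag any backwards step and any forward step of more than an hour
between two adjacent reads. Since the jumps described here were rare, a
real test would presumably have to run for a long time.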

The root cause was a concurrency/locking problem: a missing lock in an
interrupt routine, which meant that a call to clock_gettime() at the
"wrong" moment could read junk.
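
To illustrate how an unlocked update can produce a reading that is far
in the future and then "jumps back", here is a purely hypothetical
sketch. It is not the Au1000 kernel code, and the split_counter
structure is an assumption: it models a 64-bit counter stored as two
32-bit words, and replays, in a single thread, a reader that looks at
the counter between the two halves of an update.

```c
#include <stdint.h>

/* Hypothetical 64-bit counter stored as two 32-bit words, which is how
 * a 32-bit CPU has to update it: one word at a time. */
struct split_counter {
    uint32_t hi;
    uint32_t lo;
};

/* Reader: combine both words into one 64-bit value. */
uint64_t split_read(const struct split_counter *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}

/* Writer, first half of an update: store only the high word. */
void split_write_hi(struct split_counter *c, uint64_t v)
{
    c->hi = (uint32_t)(v >> 32);
}

/* Writer, second half of an update: store only the low word. */
void split_write_lo(struct split_counter *c, uint64_t v)
{
    c->lo = (uint32_t)v;
}

/* Replay one unlucky interleaving deterministically.
 * out[0]: value before the update
 * out[1]: value seen by a reader between the two stores (torn read)
 * out[2]: value after the update has completed */
void replay_torn_read(uint64_t out[3])
{
    struct split_counter c = { 0x1u, 0xFFFFFFFFu };
    uint64_t next = split_read(&c) + 1;   /* low word is about to wrap */

    out[0] = split_read(&c);
    split_write_hi(&c, next);
    out[1] = split_read(&c);              /* no lock: reader sees new hi, old lo */
    split_write_lo(&c, next);
    out[2] = split_read(&c);
}
```

In this replay the torn reading is 0xFFFFFFFF ticks ahead of the value
the counter holds once the update completes, so the next read appears to
go back in time. Holding a lock (or retrying the read when the high word
changed) around both stores would close that window.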

