infinite loop when beam.smp compiled with -O2 on debian lenny
Mon May 3 22:29:04 CEST 2010
We hit a bug while running rabbitmq where the beam.smp process was stuck
in a tight loop in the erts_poll_info function.
The process was eating up 100% of exactly one core (on a multi-core box) and
rabbitmq was dysfunctional. Unfortunately,
I could not create a small test case to reproduce this condition, but it
would happen quite frequently while rabbitmq was in
The C code for the function gave me no hint as to what could be
spinning there
(though this was my first time looking at this codebase). Finally, looking
through the disassembly in gdb at the point where our process was spinning, I saw
the following lines in the
0x00000000004f0fe9 <erts_poll_info_kp+185>: nopl 0x0(%rax)
0x00000000004f0ff0 <erts_poll_info_kp+192>: jmp 0x4f0fe9
(Similar assembly code can be seen when the KERNEL_POLL option is
Clearly the above will trivially spin forever any time we get into that
code path. It
looks suspiciously like some code got optimized out by the compiler, leaving
only the jump behind.
So I compiled with -O1 and then with no optimization at all. With -O1, I saw
a weird jmp instruction jumping to its own address:
0x0000000000517102 <erts_poll_info_kp+60>: jmp 0x517102
With no optimization, none of those trivial spins existed, but I
didn't analyze the unoptimized
code enough to say whether it can be proven to have an infinite loop (i.e.,
whether the optimizing
compiler is simply doing its job vs. this being a compiler bug).
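For illustration, here is a minimal sketch of how a source-level loop can legitimately compile down to a jump-to-self. It is a hypothetical example, not code from the ERTS sources: struct block and both function names are made up, and I am only assuming that a loop of this general shape could be involved. The max_steps cap is there purely so the demo returns.

```c
#include <stddef.h>

/* Hypothetical list node; NOT a type from the ERTS sources. */
struct block {
    struct block *next;
    int len;
};

/*
 * Sums len over the nodes from head up to (not including) stop.
 * The cursor advances on every iteration, so the loop terminates
 * as long as stop is reachable from head.
 */
int sum_with_advance(const struct block *head, const struct block *stop)
{
    int total = 0;
    const struct block *cur;

    for (cur = head; cur != stop; cur = cur->next)
        total += cur->len;
    return total;
}

/*
 * Same loop with the advance step forgotten. Once cur != stop is true
 * it stays true, because nothing in the body changes cur or stop.
 * The max_steps cap is an assumption added for this demo only; it
 * makes the function return -1 instead of spinning forever.
 */
int sum_without_advance(const struct block *head, const struct block *stop,
                        long max_steps)
{
    int total = 0;
    long steps = 0;
    const struct block *cur = head;

    while (cur != stop) {
        if (steps++ >= max_steps)
            return -1;          /* would have looped forever */
        total += cur->len;
        /* missing: cur = cur->next; */
    }
    return total;
}
```

With a three-node list a -> b -> c, sum_with_advance(&a, &c) visits a and b and returns, while sum_without_advance(&a, &c, 1000) hits the cap and returns -1. Take the cap away and the loop condition is invariant, so an optimizing compiler may reduce the loop to little more than a jump to itself. That would be the "compiler doing its job" case rather than a compiler bug.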
Anyway, this problem has existed at least since the erlang-base_12.b.3-dfsg debian
package version and has been
verified to exist in the github version as of today.
Here's the gcc and debian version info:
$ gcc --version
gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2
Copyright (C) 2008 Free Software Foundation, Inc.
$ cat /etc/debian_version
I'd be happy to provide any other info as needed.