[erlang-bugs] infinite loop when beam.smp compiled with -O2 on debian lenny
Chetan Ahuja
chetan.ahuja@REDACTED
Tue May 4 01:43:27 CEST 2010
Mikael,
Thanks a lot for that catch. I think that's it. Just did recompiles with
your patch (with -O2) and the body of the loop now shows up in the generated
code and the trivial spin loop is gone.
I got blindsided by the optimizer completely eliminating the body of the
loop, which meant I couldn't see urqbp on the stack at all! This led me to
assume that the surrounding macro (ERTS_POLL_USE_UPDATE_REQUESTS_QUEUE)
was perhaps undefined and that the loop wasn't even compiled in. Yet
another strike against coding C in pre-processor macros.
Overall, it's a big relief to know that our standard install of gcc is
not generating such obviously buggy code. I look forward to seeing the
erts_poll_info fix in an upcoming git version.
Thanks a lot once again
Chetan
On Mon, May 3, 2010 at 2:54 PM, Mikael Pettersson <mikpe@REDACTED> wrote:
> Chetan Ahuja writes:
> > Hi,
> >
> > We hit a bug while running rabbitmq where the beam.smp process was stuck
> > in a tight loop in the erts_poll_info function. The process was eating
> > up 100% of exactly one core (on a multi-core box) and rabbitmq was
> > dysfunctional. Unfortunately, I could not create a small test case to
> > reproduce this condition, but it would happen quite frequently while
> > rabbitmq was in operation.
> >
> > The C code for the function didn't provide any hints about what could
> > have been spinning in it (this is my first time looking at this
> > codebase, though). Finally, looking through the disassembly in gdb at
> > the point where our process was spinning, I saw the following lines in
> > the erts_poll_info_kp function:
> >
> >
> > 0x00000000004f0fe9 <erts_poll_info_kp+185>: nopl 0x0(%rax)
> > 0x00000000004f0ff0 <erts_poll_info_kp+192>: jmp 0x4f0fe9 <erts_poll_info_kp+185>
> >
> > (Similar assembly code can be seen when the KERNEL_POLL option is
> > disabled.)
> >
> > Clearly the above will trivially spin forever any time we get into that
> > code path. It looks suspiciously like some code got optimized out by the
> > compiler, leaving the crazy loop code.
> >
> > So I compiled with -O1 and then with no optimization at all. With -O1, I
> > saw a weird jmp instruction jumping to its own address:
> >
> > 0x0000000000517102 <erts_poll_info_kp+60>: jmp 0x517102 <erts_poll_info_kp+60>
> >
> > With no optimization, none of those trivial spins existed, but I didn't
> > analyze the unoptimized code enough to say whether it can be proven to
> > have an infinite loop (i.e., whether the optimizing compiler is simply
> > doing its job vs. this being a compiler bug).
> >
> > Anyway, this problem has existed at least since the
> > erlang-base_12.b.3-dfsg debian package version and has been verified to
> > exist in the github version as of today.
> >
> >
> > Here's the gcc and debian version info:
> > $ gcc --version
> > gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2
> > Copyright (C) 2008 Free Software Foundation, Inc.
>
> I looked at the procedure in question (not so easy to locate due to
> some "creative" C preprocessor abuse), and noticed an obvious bug:
> there's a loop over a linked list that forgets to actually advance
> the node pointer to the next element. When optimizing, gcc will notice
> that the loop doesn't terminate, omit the body of the loop (the
> calculations are dead), which will result in the type of object code
> shown above. Thus, it's an Erlang VM bug not a gcc miscompilation.
>
> Try the patch below and let us know if it solves your problem.
>
> /Mikael
>
> --- otp_src_R13B03/erts/emulator/sys/common/erl_poll.c.~1~ 2009-03-12 13:16:29.000000000 +0100
> +++ otp_src_R13B03/erts/emulator/sys/common/erl_poll.c 2010-05-03 23:41:32.000000000 +0200
> @@ -2404,6 +2404,7 @@ ERTS_POLL_EXPORT(erts_poll_info)(ErtsPol
> while (urqbp) {
> size += sizeof(ErtsPollSetUpdateRequestsBlock);
> pending_updates += urqbp->len;
> + urqbp = urqbp->next;
> }
> }
> #endif
>
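
For readers who want to see the bug class in isolation, here is a minimal
sketch of the pattern the patch above fixes. It is not the actual ERTS code:
struct update_block and sum_blocks are hypothetical names, and only the two
fields the patch touches (len and next) are modeled. The buggy loop shape is
left in a comment so that the example terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ErtsPollSetUpdateRequestsBlock; only the
 * fields used in the patched loop (len, next) are modeled here. */
struct update_block {
    int len;
    struct update_block *next;
};

/* Buggy shape (comment only, so this example terminates):
 *
 *     while (b) {
 *         size += sizeof(*b);
 *         pending += b->len;
 *     }
 *
 * b never changes inside the loop, so once entered with b != NULL the
 * loop cannot exit. */

/* Fixed shape: advance the node pointer on every iteration. */
static size_t sum_blocks(const struct update_block *b, int *pending_out)
{
    size_t size = 0;
    int pending = 0;

    assert(pending_out != NULL);
    while (b) {
        size += sizeof(*b);
        pending += b->len;
        b = b->next;   /* the line the patch adds */
    }
    *pending_out = pending;
    return size;
}
```

Calling sum_blocks on a three-node list with len values 1, 2 and 3 should
report 6 pending updates and a size of three blocks. With the buggy shape,
the loop's results can never be observed once it is entered, which would
explain why an optimizing compiler can reduce it to a bare jump to itself,
as in the disassembly quoted earlier.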