[erlang-questions] Strange bus error 7 with HiPE R12B-5 + patch

Mikael Pettersson mikpe@REDACTED
Tue Feb 24 15:12:32 CET 2009


Paul Fisher writes:
 > I sent this to erlang-bugs last night and it has not shown up as of 
 > yet... Trying to erlang-questions in the hopes that someone can still 
 > look and give some guidance/assistance today.
 > 
 > thx
 > 
 > Paul Fisher wrote:
 > > We have been having strange core dumps happen occasionally in our 
 > > environment, most of which end up with a stack trace like the following:
 > > 
 > > (gdb) where
 > > #0  0x0000000045050b39 in ?? ()
 > > #1  0x00002aaaabe25922 in ?? ()
 > > #2  0x000000000000078f in ?? ()
 > > #3  0x0000000000000000 in ?? ()
 > > 
 > > This happens on thread 1... the one that ends up running 
 > > erts_sys_main_thread().  Pretty weird.
 > > 
 > > Today, I got a core dump on the same thread, while it was running 
 > > gensweep_nstack().  What follows is my brief trolling through the dump 
 > > and just following things until it was clear how we ended up at the 
 > > problem.
 > > 
 > > I would love it if someone could help take the diagnosis further, 
 > > because I'm not sure where to look next.  Without HiPE 
 > > compilation (which we do on most of our modules) these problems do not 
 > > occur, so it does seem to point to a HiPE-related issue.
 > > 
 > > The environment is a 4-core Core 2 Q6600 with 4 GB of ECC memory, with 
 > > the emulator running in SMP mode as a 64-bit build.
 > > 
 > > cluster-14:/var/alertlogic# uname -a
 > > Linux cluster-14 2.6.24-etchnhalf.1-amd64 #1 SMP Fri Dec 26 03:26:12 UTC 2008 x86_64 GNU/Linux
 > > 
 > > Anyway, here is the gdb session:
 > > 
 > > Core was generated by `/usr/lib/erlang/erts-5.6.5/bin/beam.smp -Ktrue -W 
 > > w -A 32 -a 128 -d -- -root /u'.
 > > Program terminated with signal 7, Bus error.
 > > #0  gensweep_nstack (p=0x2aaaade37808, ptr_old_htop=0x44048b28,
 > >      ptr_n_htop=0x44048b20) at hipe/hipe_stack.h:70
 > > 70	    if (likely(sdesc->bucket.hvalue == ra))
 > > (gdb) where
 > > #0  gensweep_nstack (p=0x2aaaade37808, ptr_old_htop=0x44048b28,
 > >      ptr_n_htop=0x44048b20) at hipe/hipe_stack.h:70
 > > #1  0x00000000004bfc35 in minor_collection (p=0x2aaaade37808, need=2,
 > >      objv=0x0, nobj=0, recl=0x44048e68) at beam/erl_gc.c:893
 > > #2  0x00000000004c0761 in erts_garbage_collect (p=0x2aaaade37808, need=2,
 > >      objv=0x0, nobj=0) at beam/erl_gc.c:374
 > > #3  0x000000000050ae1f in hipe_gc (p=0x1b9b860, need=46912528116752)
 > >      at hipe/hipe_native_bif.c:69
 > > #4  0x000000000050be74 in nbif_gc_1 ()
 > >      at x86_64-unknown-linux-gnu/opt/smp/hipe_amd64_bifs.S:540
 > > #5  0x00002aaaade37808 in ?? ()
 > > #6  0x00002aaaade37a80 in ?? ()
 > > #7  0x0000000000000007 in ?? ()
 > > #8  0x00002aaaaaaed9c0 in ?? ()
 > > #9  0x00002aaaabe8f0c8 in ?? ()
 > > #10 0x00002aaaabe937d8 in ?? ()
 > > #11 0x00002aaaabe937d8 in ?? ()
 > > #12 0x0000000000509ce4 in hipe_mode_switch (p=0x2aaaade37808, cmd=2895309840,
 > >      reg=0x2aaaaaaed9c0) at hipe/hipe_x86_glue.h:196
 > > #13 0x00000000004dd97b in process_main () at beam/beam_emu.c:4681
 > > #14 0x000000000048100f in sched_thread_func (vesdp=<value optimized out>)
 > >      at beam/erl_process.c:752
 > > #15 0x0000000000549f24 in thr_wrapper (vtwd=<value optimized out>)
 > >      at common/ethread.c:474
 > > #16 0x00002afcecf9ef1a in start_thread () from /lib/libpthread.so.0
 > > #17 0x00002afced2815d2 in sysctl () from /lib/libc.so.6
 > > #18 0x0000000000000000 in ?? ()

Yes, it looks like something in HiPE.
However, without a test case there is essentially nothing
anyone can do to debug it.


