[erlang-bugs] Segfault on amd64 Debian with HiPE

Mikael Pettersson mikpe@REDACTED
Tue Dec 16 10:14:19 CET 2008


Colm Dougan writes:
 > Hi,
 > 
 > I have a reproducible segfault when running some stuff on an amd64
 > Debian where some of the modules involved are compiled with HiPE.   I
 > can get around the problem by not compiling one particular module
 > natively.  I don't seem to get the same behavior on an i386 Debian
 > with the same code.
 > 
 > Unfortunately the problem seems to be a very subtle interaction
 > between many different modules in our system, some of which are
 > compiled natively and others which are not, which makes it rather
 > difficult to boil down to a stand-alone test script I can send you.
 > However, I'd be happy to work (probably off-list) with anyone who
 > wants more information.
 > 
 > To give a brief, and I appreciate rather vague, synopsis of the code:
 > the offending module does a pattern match on a binary and produces smaller
 > binaries which are then passed into another module which does more
 > binary chopping up and eventually hands off to another process which
 > does ets/mnesia inserts with data.  It seems that the segfault happens
 > in the GC code somewhere in the final part of that process.

Without a test case which allows us to reproduce the problem
it's going to be very difficult for us to diagnose it.

Having said that, a segfault in gensweep_nstack() is most likely
caused by the compiler assigning an incorrect type to a temporary
register: only one type of temps can refer to Erlang terms, and
only those temps are traced during garbage collection. An incorrect
type assignment can cause a live term to be collected or a non-term
to be traced.
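
To make the two failure modes concrete, here is a toy model in Python.
It is only a sketch under stated assumptions: the names Temp, collect
and the dict-based heap are invented for illustration and are not
HiPE's actual data structures, and the MemoryError merely stands in
for the segfault a real collector would hit.

```python
# Toy model only: not HiPE's real temps, stack maps, or collector.
from dataclasses import dataclass


@dataclass
class Temp:
    value: int    # heap address if the temp holds a term, else a raw machine word
    tagged: bool  # type assigned by the compiler: True means "holds an Erlang term"


def collect(heap, temps):
    """Keep only the heap cells referenced by temps typed as tagged."""
    live = {}
    for t in temps:
        if not t.tagged:
            continue  # untagged temps are never traced
        if t.value not in heap:
            # The collector followed a word that is not a term.
            raise MemoryError(f"traced non-term word {t.value:#x}")
        live[t.value] = heap[t.value]
    return live


heap = {0x10: "<<1,2,3>>", 0x20: "{ok, done}"}

# Correct typing: both terms survive the collection.
ok = collect(heap, [Temp(0x10, True), Temp(0x20, True)])

# A term mistyped as untagged: the live term at 0x20 is dropped.
lost = collect(heap, [Temp(0x10, True), Temp(0x20, False)])

# A raw word mistyped as tagged: the collector follows a bogus address.
try:
    collect(heap, [Temp(0xDEADBEEF, True)])
    crashed = False
except MemoryError:
    crashed = True
```

In the second case nothing fails at collection time; the damage only
shows up later, when the program touches the dropped term. That would
be consistent with a crash that appears far from the miscompiled code.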

But to debug this we'd need to see the actual Erlang code which
gets miscompiled.


