[erlang-bugs] hipe crash with compiler modules

Mikael Pettersson mikpe@REDACTED
Tue Nov 3 23:57:51 CET 2009


Paul Guyot writes:
 > Hello,
 > 
 > I have been experiencing a random crash with hipe on FreeBSD 32bits  
 > (R13B) and MacOS X 10.6 64bits (R13B01) when compiler modules have  
 > been recompiled with native code.

64-bit native code on OSX has not been validated by the HiPE group,
so it is unsupported. 32-bit native code on OSX 10.5 seems to work,
but has been only very lightly tested by us.

 > The compiler modules have been recompiled with some code that goes  
 > like this :
 > 
 >              {_, Beam, Path} = code:get_object_code(Module),
 >              {ok, _, Chunks} = beam_lib:all_chunks(Beam),
 >              {ok, {Target, HipeBinary}} = hipe:compile(Module),
 >              ChunkName = hipe_unified_loader:chunk_name(Target),
 >              {ok, NewBeam} = beam_lib:build_module(Chunks ++ [{ChunkName, HipeBinary}]),

The proper way to compile modules is to pass 'native' as
an option to the BEAM compiler. I do not consider hipe:compile
or hipe_unified_loader:chunk_name to be public APIs.
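
For instance, a minimal sketch of the supported route (the module name
native_build and the helper compile_native/1 are made up for illustration;
only compile:file/2 and the 'native' option are real):

```erlang
-module(native_build).
-export([compile_native/1]).

%% Compile Source with the ordinary BEAM compiler and ask it to also
%% generate native code. With 'native' in the option list the compiler
%% invokes HiPE itself and stores the native code in the .beam file,
%% so no manual chunk handling is needed.
%% Returns {ok, Module} on success, or error if compilation fails.
compile_native(Source) ->
    compile:file(Source, [native, report_errors, report_warnings]).
```

From the Erlang shell, c(Module, [native]) should have the same effect, as
should passing +native to erlc.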

So why do you do it in this awkward way?

 > The crash happens when I compile several files (a dozen) at once with  
 > a rpc:pmap. I believe the rpc:pmap is the reason why the crash happens  
 > randomly. This is with an internal tool called erl_make. If I run  
 > erl_make clean && erl_make install, I get a crash, but if I do  
 > erl_make install; erl_make install, the second operation (almost  
 > always) succeeds. Or sometimes, I need to run erl_make clean to  
 > successfully compile with erl_make install.
 > 
 > The stack trace (on MacOS X) looks like this :
 > 
 > Thread 4 Crashed:
 > 0   beam.smp                      	0x000000000055dc0f gensweep_nstack + 623
 > 1   beam.smp                      	0x00000000004e5591 do_minor + 313
 > 2   beam.smp                      	0x00000000004e4ef9 minor_collection + 547
 > 3   beam.smp                      	0x00000000004e34f4 erts_garbage_collect + 590
 > 4   beam.smp                      	0x00000000004e31de erts_gc_after_bif_call + 153
 > 5   beam.smp                      	0x000000000051acee process_main + 42816
 > 6   beam.smp                      	0x000000000047a833 sched_thread_func + 357
 > 7   beam.smp                      	0x000000000059ca27 thr_wrapper + 103
 > 8   libSystem.B.dylib             	0x00007fff86da4f66 _pthread_start + 331
 > 9   libSystem.B.dylib             	0x00007fff86da4e19 thread_start + 13
 > 
 > If all compiler beam files are replaced with the original ones (i.e.  
 > without the hipe chunk), there is no crash. I couldn't single out a  
 > compiler module that causes the crash. It looks like that if several  
 > of them are native, the crash does happen.
 > 
 > I found a reference to a crash in gensweep_nstack in the archives :
 > http://erlang.org/pipermail/erlang-bugs/2008-December/001131.html
 > 
 > In this case, the code that gets compiled natively is just part of  
 > OTP. Do you have any hint about what can be done to track down the bug ?

There is a known problem with concurrent invocations of the HiPE compiler.
It looks like the serialization of code loading that the BEAM loader is
supposed to do isn't happening, or it is bypassed. This corrupts certain
runtime system data structures causing crashes during GC. I'm currently
trying to debug this problem.
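
One way to check whether concurrency is the trigger would be to keep the
rpc:pmap but funnel the compiler calls through a single lock. This is only
a diagnostic sketch, not a fix: the module name serial_native, the lock
name native_compile_lock and both helper functions are hypothetical, and
it assumes the build tool calls compile:file/2 directly.

```erlang
-module(serial_native).
-export([compile/1, compile_all/1]).

%% Compile one file while holding a lock, so that at most one compiler
%% invocation runs at any time. global:trans/2 blocks until the lock
%% is free, runs the fun, and releases the lock afterwards.
compile(Source) ->
    LockId = {native_compile_lock, self()},   % hypothetical lock name
    global:trans(LockId,
                 fun() -> compile:file(Source, [native, report]) end).

%% Same shape as the parallel build described above: the workers are
%% still spawned by rpc:pmap/3, but the compile step is serialized.
compile_all(Sources) ->
    rpc:pmap({?MODULE, compile}, [], Sources).
```

If the crash disappears with the lock in place and returns without it,
that would point at the concurrent invocations rather than at any single
natively compiled module.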

