fix native code crash when calling unloaded module with on_load function

Mikael Pettersson mikpe@REDACTED
Wed Jun 30 15:24:22 CEST 2010


Paul Guyot writes:
 > On 30 June 2010 at 14:31, Mikael Pettersson wrote:
 > 
 > > As reported in erlang-bugs, the following sequence of events crashes the VM:
 > > 
 > > 1. Module M1 is loaded and in native mode.
 > > 2. Module M2 is not loaded, in emulated mode, and has an on_load function.
 > > 3. M1 calls some function in M2. This works.
 > > 4. M1 again calls some function in M2. This segfaults.
 > > 
 > > The reason for the crash is that when the beam loader fixes up export
 > > entries after a successful on_load function call, it erroneously clears
 > > the ->code[3] field in that module's export entries.  This is redundant
 > > (no code in beam relies on ->code[3] being NULL), inconsistent with
 > > modules without on_load functions (there ->code[3] remains a valid beam
 > > instruction after the module is loaded), and breaks native code which needs
 > > the old ->address value in an export entry to remain valid after a module
 > > load step (before the load ->address points to ->code[3], after the load
 > > ->address points to the real code but uses of the old ->address value
 > > remain so ->code[3] must remain valid).
 > > 
 > > Thus the fix for the crash is to simply not clear ->code[3].
 > > This patch fixes R14A and should also fix R13B04.
 > > 
 > > (There does exist a performance bug in this area, but it is unrelated
 > > to the on_load feature so will be fixed separately.)
 > 
 > Hello Mikael,
 > 
 > Did you have a chance to check the patch I submitted to erlang-patch and which is available here:
 > 
 > http://github.com/pguyot/otp/commit/495804b097aea4015e218d7b5da8d1372395580c

I did.  It's way overkill for this specific bug.

 > My impression is that if we do not clear ep->code[3], it still points to call_error_handler. Instead of not clearing the value and relying on its initial assignment, I replaced the call_error_handler beam instruction with a new beam instruction (call_from_hipe_stub) that simply jumps to the function.

Your impression is correct, but it's not necessary to invent a new BEAM instruction
to solve the on_load crash bug.
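
To make the failure mode concrete, here is a minimal standalone C sketch of
the mechanism described in the quoted report. It is a toy model, not the
actual beam loader code: the struct layout, the opcode values, and the names
export_entry, init_unloaded, finish_load, execute and simulate are all
assumptions made for illustration. It only shows why a caller that kept the
old ->address value breaks once ->code[3] is cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for this model; real BEAM instructions differ. */
enum { OP_NONE = 0, OP_CALL_ERROR_HANDLER = 1, OP_REAL_CODE = 2 };

/* Simplified stand-in for an export entry (assumed layout). */
typedef struct {
    const unsigned *address; /* where callers jump */
    unsigned code[4];        /* code[3] holds the stub instruction */
} export_entry;

/* Stand-in for the module's real code after loading. */
static const unsigned real_code[1] = { OP_REAL_CODE };

/* Before the load, ->address points at the stub in ->code[3]. */
static void init_unloaded(export_entry *ep)
{
    assert(ep != NULL);
    ep->code[0] = ep->code[1] = ep->code[2] = OP_NONE;
    ep->code[3] = OP_CALL_ERROR_HANDLER;
    ep->address = &ep->code[3];
}

/* After the load, ->address points at the real code.
 * clear_stub != 0 models the buggy fix-up that wipes ->code[3]. */
static void finish_load(export_entry *ep, int clear_stub)
{
    assert(ep != NULL);
    ep->address = real_code;
    if (clear_stub)
        ep->code[3] = OP_NONE;
}

/* Execute the instruction at pc: 0 on success, -1 for a modelled crash. */
static int execute(const export_entry *ep, const unsigned *pc)
{
    if (*pc == OP_REAL_CODE)
        return 0;
    if (*pc == OP_CALL_ERROR_HANDLER && ep->address != pc)
        return execute(ep, ep->address); /* slow, but still valid */
    return -1; /* cleared stub: nothing valid to execute */
}

/* M1 (native) caches the old address, M2 gets loaded, M1 calls again. */
static int simulate(int clear_stub)
{
    export_entry m2_f;
    init_unloaded(&m2_f);
    const unsigned *cached = m2_f.address; /* old ->address kept by M1 */
    finish_load(&m2_f, clear_stub);        /* first call triggers the load */
    return execute(&m2_f, cached);         /* second call uses stale address */
}
```

In this model simulate(1) returns -1 (the stale address lands on a cleared
slot) and simulate(0) returns 0 (the stale address still reaches the real
code, via the error handler stub). The extra hop in the second case is only
a feature of the model and should not be read as a measurement of anything.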

Your approach is more related to my comment:
 > > (There does exist a performance bug in this area, but it is unrelated
 > > to the on_load feature so will be fixed separately.)

But the performance bug exists whether or not on_load is used.  And the
solution I have in mind is very different from yours.

/Mikael

