fix native code crash when calling unloaded module with on_load function
Mikael Pettersson
mikpe@REDACTED
Wed Jun 30 15:24:22 CEST 2010
Paul Guyot writes:
> On 30 June 2010 at 14:31, Mikael Pettersson wrote:
>
> > As reported in erlang-bugs, the following sequence of events crashes the VM:
> >
> > 1. Module M1 is loaded and in native mode.
> > 2. Module M2 is not loaded, in emulated mode, and has an on_load function.
> > 3. M1 calls some function in M2. This works.
> > 4. M1 again calls some function in M2. This segfaults.
> >
> > The reason for the crash is that when the beam loader fixes up export
> > entries after a successful on_load function call, it erroneously clears
> > the ->code[3] field in that module's export entries. Clearing it is
> > redundant (no code in beam relies on ->code[3] being NULL) and
> > inconsistent with modules without on_load functions (there ->code[3]
> > remains a valid beam instruction after the module is loaded). It also
> > breaks native code, which needs the old ->address value in an export
> > entry to remain valid after a module load step: before the load,
> > ->address points to ->code[3]; after the load, ->address points to the
> > real code, but uses of the old ->address value remain, so ->code[3]
> > must remain valid.
> >
> > Thus the fix for the crash is to simply not clear ->code[3].
> > This patch fixes R14A and should also fix R13B04.
> >
> > (There does exist a performance bug in this area, but it is unrelated
> > to the on_load feature so will be fixed separately.)
>
> Hello Mikael,
>
> Did you have a chance to check the patch I submitted to erlang-patches, which is available here:
>
> http://github.com/pguyot/otp/commit/495804b097aea4015e218d7b5da8d1372395580c
I did. It's way overkill for this specific bug.
> My impression is that if we do not clear ep->code[3], it still points to call_error_handler. Instead of leaving the value alone and relying on its initial assignment, I replaced the call_error_handler beam instruction with a new beam instruction (call_from_hipe_stub) that simply jumps to the function.
Your impression is correct, but it's not necessary to invent a new BEAM instruction
to solve the on_load crash bug.
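
To make the failure mode concrete, here is a toy model of the situation
described in the original report. It is only a sketch under stated
assumptions, not the actual beam loader code: the struct layout, the
opcode values, and the names export_entry, export_init, finish_load and
cached_address_is_valid are all hypothetical and exist only for this
illustration. The one thing taken from the report is the relationship
between ->address and ->code[3].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

/* Hypothetical opcodes for this toy model; 0 stands for a cleared slot. */
enum { OP_INVALID = 0, OP_CALL_ERROR_HANDLER = 1, OP_REAL_CODE = 2 };

/* Simplified stand-in for an export entry (NOT the real struct layout). */
typedef struct {
    const word *address;  /* where callers of the function jump */
    word code[5];         /* code[3] holds the stub instruction */
} export_entry;

/* Before the module is loaded, ->address points at the stub in ->code[3]. */
void export_init(export_entry *e)
{
    assert(e != NULL);
    for (size_t i = 0; i < 5; i++)
        e->code[i] = OP_INVALID;
    e->code[3] = OP_CALL_ERROR_HANDLER;
    e->address = &e->code[3];
}

/* Fix up the entry after a successful load.
 * clear_stub != 0 models the buggy on_load path that clears ->code[3];
 * clear_stub == 0 models the fix, which leaves ->code[3] alone. */
void finish_load(export_entry *e, const word *real_code, int clear_stub)
{
    assert(e != NULL && real_code != NULL);
    e->address = real_code;
    if (clear_stub)
        e->code[3] = OP_INVALID;
}

/* A caller that cached the old ->address value is only safe if that
 * location still holds a valid instruction. */
int cached_address_is_valid(const word *cached)
{
    return cached != NULL && *cached != (word)OP_INVALID;
}
```

In this model, a native caller that saved ->address before the load and
uses it again afterwards finds a cleared slot when clear_stub is set, and
still finds the stub instruction when it is not.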
Your approach is more related to my comment:
> > (There does exist a performance bug in this area, but it is unrelated
> > to the on_load feature so will be fixed separately.)
But the performance bug exists whether or not on_load is used. And the
solution I have in mind is very different from yours.
/Mikael