[erlang-bugs] NIF .so reload issues

Sverker Eriksson sverker.eriksson@REDACTED
Fri Dec 18 17:41:24 CET 2015


Hi David,

Yes, this is a dlopen restriction, and an ambiguous one: I've heard
different behaviour reported depending on the OS.

My Linux man page for dlopen says "If the same library is loaded again
with dlopen(), the same file handle is returned". But it does not specify
what "the same" actually means.

The Erlang VM has to keep the old .so file loaded until the module is
safely purged [*], as there may exist Erlang processes still lingering in
the old code. Trying to execute unloaded native code does not behave well.

When you call load_nif with the same library name (as the
not yet purged one), dlopen thinks it's "the same" library
and just returns the same handle again.
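
A minimal C sketch of that behaviour, outside the Erlang VM. The helper
name same_handle is made up for illustration, and it only shows what
dlopen does when asked for one path twice; it says nothing about how the
VM itself calls dlopen.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Open the same library path twice and compare the two handles.
 * Returns 1 if both dlopen() calls gave back the same handle,
 * 0 if the handles differ, and -1 if either call failed.
 * (Hypothetical helper, not part of any Erlang/OTP API.)
 */
int same_handle(const char *path)
{
    void *first = dlopen(path, RTLD_NOW);
    if (first == NULL)
        return -1;

    void *second = dlopen(path, RTLD_NOW);
    if (second == NULL) {
        dlclose(first);
        return -1;
    }

    int same = (first == second);

    /* Every successful dlopen() takes a reference, so close both. */
    dlclose(second);
    dlclose(first);
    return same;
}
```

Calling same_handle("libm.so.6") on a glibc-based Linux system should
return 1 (the library name is an assumption about what is installed).
Other platforms may behave differently, and older glibc versions need
-ldl when linking.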

What to do?

Rename the .so library, for example by giving it a version number.
Putting it in a different directory might also work (?).

Add something about this problem to the erl_nif docs? Yes, that would be
nice.

I'm hesitant to recommend purging in on_load. The on_load feature
is still experimental and we have some known problems with bad
behaviour, especially in the error cases when on_load fails.
To fix that we may have to limit what you are allowed
to do in on_load and code purging might be such a limitation.


[*] Purging may actually not be enough. If the NIF library has created
resource objects with a destructor callback, it will not be unloaded until
the last resource object has been garbage collected.

/Sverker, Erlang/OTP


On 12/18/2015 03:19 PM, David Buckley wrote:
> Hi! I was playing with writing a NIF, and found I couldn't reload.
>
> I'm doing the sort-of accepted thing of loading the nif in an on_load
> function, though if I just execute the function just after load, I get
> the same behaviour, so I don't think that's at issue.
>
> Basically, what seems to be the case is that while erlang will
> re-initialise my nif code (with 'upgrade'), it won't load a /new/
> version of the nif code unless I completely purge the (erlang) code from
> the runtime, forcing erlang to recheck the module. I'm guessing erlang
> is caching the nif. Changing the compiled (.so) filename each time fixes
> the problem.
>
>
> Example code here:
>
> https://gist.github.com/bucko909/a3b5099c74bf267e65db
>
> test_reload_post_purge and test_reload_post_reload_complete_purge work
> fine (erts-7.1), but the other three don't reload the .so file as I
> would expect.
>
>
> Is this fixable, or must I manually add a purge() in my init() function
> before load_nif? (And why does that work? Because at that point there's
> no evidence that the new module will have a load_nif, so the old dlopen
> can be discarded?)
>
> Seems like in general if the .so file has changed and a module is
> reloaded, the user probably wants the new .so file, too! It's at least
> worth adding a note to the docs (or a new return value?) if it's an evil
> dlopen restriction.
>



