[erlang-bugs] NIF resources are not checked on module unload
David Buckley
isreal-erlang-bugs-at-erlang.org@REDACTED
Sun Dec 20 20:48:42 CET 2015
While playing with implementing a NIF, I found some segfaults, and I
eventually got it down to the test case here:
https://gist.github.com/bucko909/a841c716ede6d3903a13
It looks like it's down to my not re-registering the resource type on
upgrade (presumably the type handle goes stale, is garbage collected, and
eventually corrupts memory, causing segfaults in unrelated emulator code).
I fell into this trap by using code from
https://github.com/davisp/nif-examples -- which I've sent a pull request
to fix.
I fixed my problem by adding enif_open_resource_type to the upgrade
function once I'd clocked my error, so under normal and correct use, I
think the emulator is doing OK.
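For anyone hitting the same thing, a minimal sketch of the shape of the fix might look like the following. The module name my_nif, the resource name "my_resource", the helper open_types and the ping function are placeholders for illustration, not names from the gist; only the erl_nif.h calls and flags are real.

```c
#include <erl_nif.h>

/* Placeholder names throughout: my_nif, "my_resource", open_types, ping. */
static ErlNifResourceType *my_resource_type;

static void my_resource_dtor(ErlNifEnv *env, void *obj)
{
    (void)env;
    (void)obj; /* release whatever the resource object owns */
}

/* Open (or reopen) the resource type; shared by load and upgrade. */
static int open_types(ErlNifEnv *env, ErlNifResourceFlags flags)
{
    my_resource_type = enif_open_resource_type(env, NULL, "my_resource",
                                               my_resource_dtor, flags, NULL);
    return my_resource_type == NULL ? -1 : 0;
}

static int load(ErlNifEnv *env, void **priv_data, ERL_NIF_TERM load_info)
{
    (void)priv_data;
    (void)load_info;
    return open_types(env, ERL_NIF_RT_CREATE);
}

static int upgrade(ErlNifEnv *env, void **priv_data, void **old_priv_data,
                   ERL_NIF_TERM load_info)
{
    (void)priv_data;
    (void)old_priv_data;
    (void)load_info;
    /* The step that was missing: reopen the type in the new library.
     * TAKEOVER hands existing objects (and their dtor) to this library;
     * CREATE covers the case where the type does not exist yet. */
    return open_types(env, (ErlNifResourceFlags)(ERL_NIF_RT_CREATE |
                                                 ERL_NIF_RT_TAKEOVER));
}

static ERL_NIF_TERM ping(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    (void)argc;
    (void)argv;
    return enif_make_atom(env, "pong");
}

static ErlNifFunc nif_funcs[] = {
    {"ping", 0, ping}
};

/* Arguments: module, funcs, load, reload, upgrade, unload. */
ERL_NIF_INIT(my_nif, nif_funcs, load, NULL, upgrade, NULL)
```

Sharing one helper between load and upgrade makes it harder to forget a type when a second resource is added later.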
However, it looks like if I /don't/ reopen it, it's not properly
deleted, and the documentation seems to leave open the possibility of
doing just this ("Existing resource objects, of a module that is
upgraded, must either be deleted or taken over by the new NIF library").
References to resources with the old handle remain uncleaned. Even if I
completely destroy the old module, so that unload is called, these stale
resources persist until a garbage collection. They actually survive
/many/ purge/load cycles in my example code before being garbage
collected and segfaulting the emulator.
Ideas, based on my interpretation of the bug:
If there are lingering resources which are not taken over in the
upgrade function (via ERL_NIF_RT_TAKEOVER) and which have a dtor, this
should cause an immediate emulator panic. I can't think of any other
behaviour which is safe here.
If they don't have a dtor, it seems safe to keep them around, but their
resource handle needs to be kept alive until they are all destroyed. It
ought to be impossible to create new resources using the old handle, at
least when there is a dtor defined (can a 'dead' flag be set?).
Knowing this behaviour, an application author writing an upgrade
function for this NIF library might at least attempt to destroy all of
their objects when making such an upgrade, in order to have the emulator
survive!
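To make the rules above concrete, here is a small standalone model of the check I have in mind. It is plain C and does not use or reflect the emulator's real data structures; res_type, check_on_unload, model_alloc, model_free and the verdict names are all invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in erl_nif.h or ERTS. */
typedef void (*dtor_fn)(void *obj);

typedef struct {
    const char *name;
    dtor_fn dtor;     /* NULL when the type has no destructor */
    size_t live;      /* resources still allocated with this type */
    bool taken_over;  /* the new library reopened the type with TAKEOVER */
    bool dead;        /* once set, no new resources may use this handle */
} res_type;

typedef enum {
    UNLOAD_OK,               /* nothing lingering, or ownership moved on */
    UNLOAD_KEEP_UNTIL_EMPTY, /* keep the handle alive until live reaches 0 */
    UNLOAD_PANIC             /* lingering resources with a dtor: proposed panic */
} unload_verdict;

/* Placeholder destructor so the model has something to point at. */
void example_dtor(void *obj) { (void)obj; }

/* Decide what happens to one resource type when its old library goes away. */
unload_verdict check_on_unload(res_type *t)
{
    if (t->taken_over)
        return UNLOAD_OK;
    t->dead = true; /* the old handle must not create new resources */
    if (t->live == 0)
        return UNLOAD_OK;
    return t->dtor != NULL ? UNLOAD_PANIC : UNLOAD_KEEP_UNTIL_EMPTY;
}

/* Allocation is refused on a dead handle. */
bool model_alloc(res_type *t)
{
    if (t->dead)
        return false;
    t->live++;
    return true;
}

/* Returns true when the last lingering resource of a dead type is gone,
 * i.e. when the handle itself could finally be freed. */
bool model_free(res_type *t)
{
    assert(t->live > 0);
    t->live--;
    return t->dead && t->live == 0;
}
```

With this model, a type with two lingering resources and no dtor yields UNLOAD_KEEP_UNTIL_EMPTY and rejects further allocations, while the same situation with a dtor yields UNLOAD_PANIC.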
Another approach is to require an /explicit/ delete of old resources,
perhaps simply a call to "enif_delete_unused_resources" or an iteration
of "enif_delete_resource" over "enif_list_resources", and have this call
fail where the old resources are still allocated. Perhaps the library
author could force a purge or panic the emulator themselves at this
point. The emulator should panic if a resource is neither deleted nor
reopened with TAKEOVER.
--
David Buckley