Help with NIF upgrade implementation

Sverker Eriksson sverker.eriksson@REDACTED
Fri Sep 11 18:57:51 CEST 2020


Each call to erlang:load_nif/2 corresponds to a call to dlopen() (on Unix).

dlopen will load code and static data [1] into memory. The code and its static data are tightly coupled. You can’t load or unload one without the other.

 

dlclose() is called when an Erlang module with a NIF library is purged [2]. If it is the last reference to the library, the code and static data will be deallocated.

 

If two Erlang module instances (like the “current” and “old” generations of the same module) load the same NIF library instance, then they will share the code and static data of that NIF library. In practice the second dlopen call will see that the library is already loaded and do nothing (except increase some internal reference counter I guess).
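
As a rough illustration of that behaviour outside the VM, the sketch below calls dlopen() twice on the same target and compares the returned handles. The helper name opens_share_instance is hypothetical. That the two handles compare equal is what common implementations such as glibc show, not something stated in this thread. Passing NULL opens the running program itself, so no particular .so file is assumed.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if two dlopen() calls on the same path
 * yield the same handle, 0 if the handles differ, and -1 if the target
 * cannot be opened at all. */
int opens_share_instance(const char *path)
{
    void *first = dlopen(path, RTLD_NOW);
    if (first == NULL)
        return -1;

    void *second = dlopen(path, RTLD_NOW);
    if (second == NULL) {
        dlclose(first);
        return -1;
    }

    int same = (first == second);

    /* Every successful dlopen() is balanced by one dlclose(). */
    dlclose(second);
    dlclose(first);
    return same;
}
```

Calling opens_share_instance(NULL) is the simplest way to try it; on older glibc versions the program may need to be linked with -ldl.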

 

You can detect shared static data by having a static reference counter of your own that you increase in load/upgrade callback and decrease in unload callback.
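
A minimal sketch of such a counter, written as plain C so it compiles without erl_nif.h. The functions counted_load, counted_upgrade and counted_unload are hypothetical stand-ins for the real load, upgrade and unload callbacks, which also take an ErlNifEnv* and further arguments.

```c
/* Static data: one copy exists per loaded instance of the library. */
static int instance_refs = 0;

/* Stand-in for the NIF load callback (hypothetical name). */
int counted_load(void)
{
    return ++instance_refs;
}

/* Stand-in for the NIF upgrade callback (hypothetical name).
 * It runs in the library instance being loaded by the new module code. */
int counted_upgrade(void)
{
    return ++instance_refs;
}

/* Stand-in for the NIF unload callback (hypothetical name). */
int counted_unload(void)
{
    return --instance_refs;
}

/* More than one reference means that two module instances are using
 * the same static data, i.e. the same library instance. */
int static_data_is_shared(void)
{
    return instance_refs > 1;
}
```

If upgrade ever sees the counter at zero before incrementing it, the new module instance got a fresh copy of the static data, which means a separate library instance was loaded.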

 

The only other way to share static data (that I can think of) is for two different NIF libraries to use a common shared library with static data. This can be done either by dynamic linking or by calling dlopen at runtime.

 

The Erlang VM does not do any spooky magic to let one NIF library inherit static data from another.

 

If you want shared NIF data to be inherited from “old” to “current” NIF library instance, then the recommended way is to allocate it dynamically (enif_alloc) and use the “priv_data” feature to hand over the data in the upgrade callback.
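
The handover can be sketched as follows, again in plain C so it compiles without erl_nif.h. The names shared_state, state_load and state_upgrade are hypothetical, malloc stands in for enif_alloc, and the real callbacks take an ErlNifEnv* and a load_info term in addition to the priv_data pointers shown here.

```c
#include <stdlib.h>

/* Hypothetical state that should survive a module upgrade. */
typedef struct {
    long calls;
} shared_state;

/* Stand-in for the load callback: allocate the state dynamically and
 * publish it through priv_data (a real NIF would use enif_alloc). */
int state_load(void **priv_data)
{
    shared_state *st = malloc(sizeof *st);
    if (st == NULL)
        return -1;
    st->calls = 0;
    *priv_data = st;
    return 0;
}

/* Stand-in for the upgrade callback: the new instance adopts the
 * pointer that the old instance stored, so no data is copied or lost. */
int state_upgrade(void **priv_data, void **old_priv_data)
{
    *priv_data = *old_priv_data;
    return 0;
}
```

Ownership needs care with this pattern: once both instances point at the same block, the old instance's unload callback must not free it while the new instance is still using it.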

 

NIF resource types are inherited from the “old” instance by calling enif_open_resource_type with ERL_NIF_RT_TAKEOVER.

 

One thing to look out for is that dlopen() (depending on OS) may identify the library by its file name only. That is, even if the .so file has been rebuilt with new code, dlopen may think it’s the same as the old one already loaded and do nothing.

 

[1] Static data: variables declared as “static” or at global scope in a C program.

[2] The call to dlclose() for a NIF library will be delayed after purge until all resource objects, with destructor or other callback functions implemented in that library, have been garbage collected.

 

/Sverker, Erlang/OTP

 

From: Harris, Robert <robert.harris@REDACTED> 
Sent: 10 September 2020 23:35
To: Sverker Eriksson <sverker.eriksson@REDACTED>
Cc: erlang-questions@REDACTED Questions <erlang-questions@REDACTED>
Subject: Re: Help with NIF upgrade implementation

 

Hi Sverker,

Many thanks for pointing this out.

I spent some time trying to create a NIF library that would be reloaded
but without any success. Are you aware of any examples?

I'm specifically interested in the implication of comments here and in
the documentation that a new version can inherit existing static data.
What exactly does that mean? If I

1. execute some function in a NIF library, thereby loading the
shared object,
2. delete the shared object,
3. compile a new version of the NIF library and
4. run 'c(<module>)'

then do I expect to see additional mappings for the text sections of the
new shared object but with no data relocations applied or...something else?

Regards,

Robert



> On 3 Sep 2020, at 11:39, Sverker Eriksson <sverker.eriksson@REDACTED> wrote:
>
> Hi Robert,
>
> Your NIF rtc_upgrade callback function is missing a call to enif_open_resource_type. You can basically do the same call as in rtc_load.
>
> The upgrade callback is called when there already exists a loaded instance of the NIF module, which is the case in your second and third calls in the shell to c(rtc).
>
> enif_open_resource_type has to be called (in load or upgrade) for each resource type that is going to be used by that logical NIF module instance. If not, the resource type will be deallocated when all resource objects are gone.
>
> /Sverker, Erlang/OTP
>
>
> From: erlang-questions <erlang-questions-bounces@REDACTED> On Behalf Of Harris, Robert
> Sent: 26 August 2020 21:16
> To: erlang-questions@REDACTED Questions <erlang-questions@REDACTED>
> Subject: Help with NIF upgrade implementation
>
> I observe a reproducible SEGV of the 21.3.8.4 VM on macos Catalina when
> repeatedly recompiling a module that has a skeletal NIF with an upgrade
> callback. The skeleton is based on the examples in the documentation
> but I am assuming that I have made an error somewhere. I would be
> grateful for any pointers.
>
> The very simple NIF library is attached, together with its makefile and
> the Erlang module's source. Once built, the failure is seen very
> quickly:
>
> $ erl
> Erlang/OTP 21 [erts-10.3.5.3] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [hipe]
>
> Eshell V10.3.5.3 (abort with ^G)
> 1> c(rtc), rtc:foo(), rtc:bar().
> ok
> 2> c(rtc), rtc:foo(), rtc:bar().
> ok
> 3> c(rtc), rtc:foo(), rtc:bar().
> Segmentation fault: 11
> $
>
> The stack is
>
> * thread #10, name = '6_scheduler', stop reason = EXC_BAD_ACCESS (code=1, address=0x70)
> frame #0: 0x000000001e41bc0e beam.smp`nif_resource_dtor at atomic.h:240
> 237 ETHR_AINT_T__ tmp;
> 238
> 239 tmp = incr;
> -> 240 __asm__ __volatile__(
> 241 "lock; xadd" ETHR_AINT_SUFFIX__ " %0, %1" /* xadd didn't exist prior to the 486 */
> 242 : "=r"(tmp)
> 243 : "m"(var->counter), "0"(tmp)
>
> * thread #10, name = '6_scheduler', stop reason = EXC_BAD_ACCESS (code=1, address=0x70)
> * frame #0: 0x000000001e41bc0e beam.smp`nif_resource_dtor at atomic.h:240
> frame #1: 0x000000001e41bc07 beam.smp`nif_resource_dtor
> frame #2: 0x000000001e41bc07 beam.smp`nif_resource_dtor
> frame #3: 0x000000001e41bc07 beam.smp`nif_resource_dtor
> frame #4: 0x000000001e41bc07 beam.smp`nif_resource_dtor
> frame #5: 0x000000001e41bbf0 beam.smp`nif_resource_dtor(bin=0x000000001fc800f8)
> frame #6: 0x000000001e3d390d beam.smp`sweep_off_heap(p=0x0000000020740778, fullsweep=<unavailable>) at erl_binary.h:453
> frame #7: 0x000000001e3d73f7 beam.smp`do_minor(p=0x0000000020740778, live_hf_end=<unavailable>, mature=<unavailable>, mature_size=696, new_sz=376, objv=<unavailable>, nobj=3) at erl_gc.c:1674
> frame #8: 0x000000001e3d9af4 beam.smp`garbage_collect(p=0x0000000020740778, live_hf_end=0xfffffffffffffff8, need=4, objv=0x000000001f7d4440, nobj=3, fcalls=3047, max_young_gen_usage=0) at erl_gc.c:1417
> frame #9: 0x000000001e3dabb5 beam.smp`erts_garbage_collect_nobump(p=0x0000000020740778, need=<unavailable>, objv=<unavailable>, nobj=<unavailable>, fcalls=3047) at erl_gc.c:878
> frame #10: 0x000000001e4e8097 beam.smp`process_main(x_reg_array=0x000000001f7d4440, f_reg_array=<unavailable>) at beam_hot.h:104
> frame #11: 0x000000001e287916 beam.smp`sched_thread_func(vesdp=0x0000000020bb3400) at erl_process.c:8469
> frame #12: 0x000000001e4cbd52 beam.smp`thr_wrapper(vtwd=0x00007ffee197c140) at ethread.c:118
> frame #13: 0x00007fff692aa109 libsystem_pthread.dylib`_pthread_start + 148
> frame #14: 0x00007fff692a5b8b libsystem_pthread.dylib`thread_start + 15
>
> Robert Harris
>
> Confidentiality Notice | This email and any included attachments may be privileged, confidential and/or otherwise protected from disclosure. Access to this email by anyone other than the intended recipient is unauthorized. If you believe you have received this email in error, please contact the sender immediately and delete all copies. If you are not the intended recipient, you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
>
>
> Disclaimer
>
> The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.
>
> This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.
>

