[erlang-questions] Reference counting in NIFs
Sverker Eriksson
sverker.eriksson@REDACTED
Thu Aug 15 11:53:03 CEST 2013
ANTHONY MOLINARO wrote:
> Hi,
>
> So I've been using re2 for some time now. I want to compile large regexes and use them from multiple processes. This however leads to a bottleneck, because I need a gen_server that holds onto the NIF resource returned from the call to re2:compile/2 and passes it to calls to re2:match/3. The re2 engine runs in a few microseconds, but the gen_server message queue can take milliseconds, and under high concurrency its message queue often grows and it takes even longer.
>
> The idea I had was to add an option to re2:compile/2 called {named_pattern, atom()} which when given an atom() keeps a copy of the re2 object on the C++ side (in a map<> at the moment), and then allows you to pass the atom() as the regex argument to re2:match/3. This works fine except in the case where you want to recompile the regex for the name (this happens every so often).
>
> Sometimes the recompile works and other times it segfaults. I'm pretty sure the segfaults occur during the switching, since a call like
>
> old_handle = named_patterns[copts.name];
> named_patterns[copts.name] = new_handle;
> enif_release_resource (old_handle);
>
> occurring in one thread can conflict with another thread calling
>
> handle = named_patterns[copts.name];
> …
> // use re2 from within handle
>
> since the handle could be released (and I assume freed).
>
> So, I've been trying various things, but ideally, I'd just use the resource reference counting and the GC to make sure I don't leak memory. However, it's a little unclear from the example in the documentation when the references are incremented and decremented.
>
> The documentation seems to suggest that calling enif_make_resource will add to the reference count, and that enif_release_resource will decrement from the reference count. Also it appears as though enif_keep_resource will also increment, but I'm not sure if enif_alloc_resource also increments (the documentation doesn't mention it).
>
> I tried a scheme where I call enif_keep_resource at the beginning of the re2:match/3 call, and enif_release_resource at the end, to attempt to keep the resource around, but I still see some segfaults.
>
> So I guess my questions are:
>
> 1. which enif_* functions change the reference count, and how?
>
alloc_resource: ref=1
release_resource: if (--ref == 0) free
make_resource: ++ref
keep_resource: ++ref
keep_resource and release_resource can also be called by the VM when
resource terms are copied or destroyed.
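
To make the four rules above concrete, here is a toy model in C++. It is only an illustration and not the erl_nif implementation: ToyResource and the toy_* functions are hypothetical names, and the "free" is just a flag. It traces the count through the alloc, make, release sequence used in the erl_nif documentation example.

```cpp
#include <cassert>

// Toy stand-in for a NIF resource (hypothetical, NOT the erl_nif internals).
struct ToyResource {
    int ref;     // reference count
    bool freed;  // set when the count reaches zero
};

// Models enif_alloc_resource: the caller owns the first reference.
ToyResource toy_alloc() { return ToyResource{1, false}; }

// Models enif_keep_resource: one more reference.
void toy_keep(ToyResource& r) {
    assert(!r.freed);
    ++r.ref;
}

// Models enif_make_resource: the new term holds its own reference.
void toy_make_term(ToyResource& r) { toy_keep(r); }

// Models enif_release_resource: free when the last reference is dropped.
void toy_release(ToyResource& r) {
    assert(r.ref > 0);
    if (--r.ref == 0) r.freed = true;
}

// alloc, make a term, then release the C-side reference.
// Returns the count that remains (the term's reference).
int count_after_compile(ToyResource& r) {
    r = toy_alloc();   // ref = 1
    toy_make_term(r);  // ref = 2
    toy_release(r);    // ref = 1, now owned by the term only
    return r.ref;
}
```

In this model, a handle that is also stored in a C++ map needs a reference of its own. If the map simply reuses the reference from alloc, then releasing the old handle on recompile can drop the count to zero while another thread is still using it.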
> 2. is there a better/safer way?
>
Why not use a public ETS table as a name lookup for your compiled regexps?
If you have to access shared data in NIFs from several threads, then I
suggest using the enif_mutex_* interface to make it thread-safe.
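
A minimal sketch of the mutex approach, under stated assumptions: std::mutex stands in for a mutex from the enif_mutex_* interface, and the Handle counter stands in for enif_keep_resource / enif_release_resource, so that the example compiles without erl_nif.h. The names Handle, acquire_pattern and swap_pattern are hypothetical. The point it tries to show is that the lookup and the keep must happen under the same lock as the swap; calling keep only after an unlocked lookup leaves a window in which the old handle can already have been released.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the compiled-regex resource.
struct Handle {
    std::atomic<int> ref{1};         // starts with the registry's reference
    std::atomic<bool> freed{false};  // set when the last reference is dropped
};

// Stand-ins for enif_keep_resource / enif_release_resource.
void keep_handle(Handle* h) { assert(!h->freed); ++h->ref; }
void release_handle(Handle* h) { if (--h->ref == 0) h->freed = true; }

std::mutex registry_lock;                       // role of an ErlNifMutex
std::map<std::string, Handle*> named_patterns;  // name -> current handle

// Reader: look up AND take a reference while holding the lock.
// The caller must call release_handle() when done matching.
Handle* acquire_pattern(const std::string& name) {
    std::lock_guard<std::mutex> guard(registry_lock);
    auto it = named_patterns.find(name);
    if (it == named_patterns.end()) return nullptr;
    keep_handle(it->second);
    return it->second;
}

// Writer: swap under the lock, then drop the registry's reference
// to the old handle. Readers that already hold it keep it alive.
void swap_pattern(const std::string& name, Handle* fresh) {
    Handle* old = nullptr;
    {
        std::lock_guard<std::mutex> guard(registry_lock);
        auto it = named_patterns.find(name);
        if (it != named_patterns.end()) old = it->second;
        named_patterns[name] = fresh;
    }
    if (old) release_handle(old);
}
```

The lock is held only for the map access and the increment, not for the match itself, so it should not reintroduce the gen_server bottleneck.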
/Sverker, Erlang/OTP