[erlang-bugs] Unloading then reloading crypto.so causes erlang core dump on Solaris 11

Sverker Eriksson sverker.eriksson@REDACTED
Fri Sep 7 15:03:31 CEST 2012


XinFeng Liu wrote:
> This I don't understand at all.
>
>     reload() in crypto.c should only be called if the module is upgraded, not if
>     it is unloaded and later loaded again.
>
>     /Sverker, Erlang/OTP
>
> Thanks for pointing this out. I'm sorry, I drew the wrong conclusion about the
> workaround of modifying the reload() function. I wrongly modified reload() in 15B02.
>
>   
Good to know.

> I have some new findings on this issue:
> I find that the latest 15B02 (without any modification) does not cause a core
> dump when running the couchDB test suite. Digging into this further, there is a
> subtle difference between 15B02 and 15B01 in crypto.so.
> In 15B01, crypto.so explicitly links libssl.so, while in 15B02 it does not.
>
>   
From the R15B02 README:

OTP-10064  Remove unnecessary dependency to libssl from crypto NIF
	      library. This dependency was introduced by accident in
	      R14B04.


> More importantly, the libssl.so built by Sun/Oracle appears to be built with
> "-z nodelete", meaning RTLD_NODELETE ("elfdump -d" shows this).
> In 15B01, loading crypto.so causes libssl.so to be loaded. Since libssl.so
> depends on libcrypto.so, libcrypto.so is somehow promoted to RTLD_NODELETE as
> well (the Solaris runtime linker debugger shows this). So libcrypto.so cannot
> be unloaded by dlclose().
> In 15B02, when running the couchDB test suite, unloading crypto.so causes
> libcrypto.so to be unloaded too, so reloading both crypto.so and libcrypto.so
> later does not trigger the previous problem.
>
>   
Ok, so it seems that removing the unnecessary dependency on libssl.so in 
R15B02 happened to act as a workaround for your problem, as that caused 
libcrypto.so to be unloaded, thereby forgetting its old obsolete 
callbacks into crypto.

> A new question: each load of crypto.so causes load() to be called, which means
> CRYPTO_set_mem_functions() is called again. I assumed it would set the callback
> functions correctly, but instruction-level tracing and the source show that it
> simply returns at line 129 because "!allow_customize" is true.
>
> (dbx) stepi
> t@REDACTED (l@REDACTED) stopped in CRYPTO_set_mem_functions at 0xfd053b0c
> 0xfd053b0c: CRYPTO_set_mem_functions+0x0034:    retl    
>
>     125 int CRYPTO_set_mem_functions(void *(*m)(size_t), void *(*r)(void *, size_t),
>     126         void (*f)(void *))
>     127         {
>     128         if (!allow_customize)
>     129                 return 0;
>     130         if ((m == 0) || (r == 0) || (f == 0))
>     131                 return 0;
>     132         malloc_func=m; malloc_ex_func=default_malloc_ex;
>     133         realloc_func=r; realloc_ex_func=default_realloc_ex;
>     134         free_func=f;
>     135         malloc_locked_func=m; malloc_locked_ex_func=default_malloc_locked_ex;
>     136         free_locked_func=f;
>     137         return 1;
>     138         }
>   
Yes, I've seen this too. It looks like a simple way for openssl to 
"protect" itself against re-registering of callbacks, which it does not 
support. Looking at the openssl code, 'allow_customize' seems to be set 
to false at the first memory allocation. This means that if you try to 
(re)set memory callbacks *after* doing something that allocated memory, 
CRYPTO_set_mem_functions() will fail by returning 0. However, crypto.c 
ignores this failure, and I guess that is part of the problem.
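
To make the sequence concrete, here is a minimal, self-contained sketch of 
the behaviour described above. It is a model only, not OpenSSL or crypto.c 
code: every name in it (model_set_mem_functions, model_alloc, 
model_nif_load, gen1_malloc, gen2_malloc) is hypothetical, and the two 
"generations" of callbacks merely stand in for the copies of a callback 
that belong to a first and a second load of a NIF library. It assumes the 
guard has the same shape as in the excerpt and that the flag is cleared by 
the first allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Library side: stands in for a libcrypto that stays resident (model only). */
typedef void *(*malloc_fn)(size_t);
typedef void (*free_fn)(void *);

static int allow_customize = 1;           /* cleared by the first allocation */
static malloc_fn current_malloc = malloc;
static free_fn current_free = free;

/* Same guard shape as the excerpt: refuse once customization is closed. */
static int model_set_mem_functions(malloc_fn m, free_fn f)
{
    if (!allow_customize)
        return 0;
    if (m == NULL || f == NULL)
        return 0;
    current_malloc = m;
    current_free = f;
    return 1;
}

static void *model_alloc(size_t n)
{
    allow_customize = 0;                  /* first allocation closes the window */
    return current_malloc(n);
}

static void model_free(void *p)
{
    current_free(p);
}

/* NIF side: callbacks owned by the first and the second load (hypothetical). */
static void *gen1_malloc(size_t n) { return malloc(n); }
static void *gen2_malloc(size_t n) { return calloc(1, n); }

/* A load hook that checks the registration result instead of ignoring it. */
static int model_nif_load(malloc_fn m)
{
    return model_set_mem_functions(m, free) ? 0 : -1;
}

/* Returns 1 if, after a reload, the library still holds the old callback. */
int reload_leaves_stale_callback(void)
{
    int first = model_nif_load(gen1_malloc);    /* initial load registers gen1 */
    assert(first == 0);
    model_free(model_alloc(16));                /* library allocates once */
    int second = model_nif_load(gen2_malloc);   /* reload: registration refused */
    return second == -1 && current_malloc == gen1_malloc;
}
```

In this model the second registration is refused and the library keeps 
pointing at the first-generation callback, which in the real case would 
live in a crypto.so that has since been unloaded. Checking the return value 
in the load hook at least turns the silent failure into a visible load 
error.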

I have written a work ticket regarding the more general problem of NIF 
libraries that do not support unloading.
Your solution seems to be running R15B02.
Another even safer solution could be to build crypto with static linking 
to openssl (configure flag --disable-dynamic-ssl-lib).

/Sverker, Erlang/OTP



