<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi!<br>
<br>
On 10/19/2012 10:42 AM, adam chan wrote:
<blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
type="cite">
<font>
<div>Hi!</div>
<div><br>
</div>
<div>I don't think the crypto NIF library was reloaded, though
the MySQL client (author: Magnus Ahltorp
<a class="moz-txt-link-rfc2396E" href="mailto:ahltorp@nada.kth.se">&lt;ahltorp@nada.kth.se&gt;</a>) in my project does use the crypto
library. </div>
<div><br>
</div>
<div>The crash happens when I upgrade a big data file (600K) that contains
only thousands of lines like:</div>
<div>&nbsp;&nbsp;&nbsp; get(xxx) -> #record{a = xx, b = xx};</div>
<div>I guess this data file has nothing to do with the
crypto NIF library? </div>
<div><br>
</div>
</font></blockquote>
Oops, that sure does not look like a crypto thing!<br>
<blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
type="cite"><font>
<div>In particular, when I execute </div>
<div>&nbsp;&nbsp;&nbsp; c:l(data_file_name)</div>
<div>repeatedly and in quick succession in the screen shell, the crash shows
up frequently.</div>
</font></blockquote>
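For reference, a minimal sketch of a loop that mimics calling c:l/1 repeatedly. It assumes that c:l/1 amounts to code:purge/1 followed by code:load_file/1; the module name reload_loop and the function run/2 are made up for illustration:<br>

```erlang
-module(reload_loop).
-export([run/2]).

%% Hypothetical helper: reload Module N times back to back,
%% doing what c:l/1 is assumed to do (purge old code, then load).
run(_Module, 0) ->
    ok;
run(Module, N) when is_integer(N), N > 0 ->
    %% Remove old code for Module (kills processes still running it).
    code:purge(Module),
    %% Load the current object file; crash here if loading fails.
    {module, Module} = code:load_file(Module),
    run(Module, N - 1).
```

Something like reload_loop:run(data_file_name, 1000) from the shell would then reload the module 1000 times in a row.<br>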
Could you send me the file and an example program, plus your distro/OS
environment, so I can reproduce it?<br>
Send it in a "private" mail to <a class="moz-txt-link-abbreviated" href="mailto:pan@erlang.org">pan@erlang.org</a>; you may not
want to spread details of your system publicly...<br>
<blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
type="cite"><font>
<div> </div>
<div><br>
</div>
<div>Yesterday I found that the stack size for my
application had not been set, which means it was running with
the Linux default stack size (10M). After I set the stack
size to 500M using the 'ulimit -s' command and split the big
data file into smaller files, the situation improved.
Maybe the small stack size is the culprit, but I am not sure.
:(</div>
<div><br>
</div>
</font></blockquote>
Nah, that should not be a problem as long as you do not use external
libraries that go crazy on the C stack. We usually don't (except
for re, which relies on the PCRE library, which has a rather
aggressive approach to the C stack when compiling regexps).<br>
<blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
type="cite"><font>
<div>In any case, is there any way to detect whether the crypto
NIF library has been reloaded? </div>
<div>I've found a discussion about "fix native code crash when
calling unloaded module with on_load function":</div>
<div>
<a class="moz-txt-link-freetext" href="http://erlang.2086793.n4.nabble.com/fix-native-code-crash-when-calling-unloaded-module-with-on-load-function-td2273502.html">http://erlang.2086793.n4.nabble.com/fix-native-code-crash-when-calling-unloaded-module-with-on-load-function-td2273502.html</a></div>
<div>I did suspect the crypto module before, since
it has an on_load attribute.</div>
</font></blockquote>
In a few days, there will be a fix for this in the master branch,
that should remove the problem. But if you have an example you can
run from the shell and that I could reproduce it with, things would
be much faster. You could of course add a printout to the crypto
module to track the calls of the on_load handler. If it's called
more than twice, there will also be unloading (because of code
purging). You could also trace calls to erlang:purge_module/1.<br>
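A minimal sketch of that tracing using the dbg module, in case it helps. The module name purge_trace and its start/0 and stop/0 functions are made up for illustration, and it assumes the runtime_tools application (which provides dbg) is available on the node:<br>

```erlang
-module(purge_trace).
-export([start/0, stop/0]).

%% Hypothetical helper: print every call to erlang:purge_module/1
%% made by any process on the node.
start() ->
    %% Start the default tracer process, which prints to the shell.
    {ok, _} = dbg:tracer(),
    %% Enable call tracing (with timestamps) for all processes.
    {ok, _} = dbg:p(all, [call, timestamp]),
    %% Set a global trace pattern on erlang:purge_module/1.
    {ok, _} = dbg:tp(erlang, purge_module, 1, []),
    ok.

%% Stop the tracer and clear the trace patterns set above.
stop() ->
    dbg:stop_clear().
```

After purge_trace:start(), each purge shows up in the shell with the calling pid and the module name, so a purge of crypto would be easy to spot.<br>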
<blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
type="cite"><font>
<div><br>
</div>
<div>Cheers,</div>
</font>
<div>
<div style="font-size:14px;font-family:Verdana;color:#000;">
<div>[Adam Chan]</div>
</div>
</div>
</blockquote>
Cheers,<br>
/Patrik<br>
<blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
type="cite">
<div><includetail>
<div><br>
</div>
<div style="font-size: 12px;font-family: Arial
Narrow;padding:2px 0 2px 0;">------------------ Original ------------------</div>
<div style="font-size: 12px;background:#efefef;padding:8px;">
<div><b>From: </b> "Patrik Nyblom"<a class="moz-txt-link-rfc2396E" href="mailto:pan@erlang.org">&lt;pan@erlang.org&gt;</a>;</div>
<div><b>Date: </b> Thu, Oct 18, 2012 08:08 PM</div>
<div><b>To: </b> "erlang-patches"<a class="moz-txt-link-rfc2396E" href="mailto:erlang-patches@erlang.org">&lt;erlang-patches@erlang.org&gt;</a>;
<wbr></div>
<div><b>Subject: </b> Re: [erlang-patches] erlang node
crashes in erts_gc_after_bif_call</div>
</div>
<div><br>
</div>
Hi!<br>
<br>
Is the crypto NIF library reloaded during upgrade? That causes
havoc <br>
unfortunately, due to the behaviour of the OpenSSL crypto
memory <br>
allocation callbacks. We're working on that one.<br>
<br>
Have you reloaded the crypto NIF library, directly or
indirectly, when <br>
this happens?<br>
<br>
Cheers,<br>
/Patrik<br>
<br>
On 10/17/2012 03:47 AM, adam chan wrote:<br>
> hello list,<br>
><br>
> I ran into two random crashes this month; each one happened more than twice.<br>
> The cause was "Program terminated with signal 11, Segmentation<br>
> fault", and they most likely happened while I was hot-updating some module code<br>
> using code:soft_purge/1 and code:load_file/1.<br>
> Although they take place in different code, the information from the core file<br>
> indicates that the function erts_gc_after_bif_call/4 was being called when the crash<br>
> happened. So I guess it is related to the GC operation.<br>
> I am using otp_src_R15B02, SMP mode.<br>
> Here is the information from the core files (I am not familiar with gdb;<br>
> I hope the information is useful).<br>
><br>
> [First One]<br>
> Reading symbols from /lib64/libutil.so.1...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libutil.so.1<br>
> Reading symbols from /lib64/libdl.so.2...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libdl.so.2<br>
> Reading symbols from /lib64/libm.so.6...(no debugging
symbols found)...done.<br>
> Loaded symbols for /lib64/libm.so.6<br>
> Reading symbols from /usr/lib64/libncurses.so.5...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /usr/lib64/libncurses.so.5<br>
> Reading symbols from /lib64/libpthread.so.0...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libpthread.so.0<br>
> Reading symbols from /lib64/librt.so.1...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/librt.so.1<br>
> Reading symbols from /lib64/libc.so.6...(no debugging
symbols found)...done.<br>
> Loaded symbols for /lib64/libc.so.6<br>
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/ld-linux-x86-64.so.2<br>
> Reading symbols from<br>
>
/usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so...done.<br>
> Loaded symbols for
/usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so<br>
> Reading symbols from /lib64/libcrypto.so.6...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libcrypto.so.6<br>
> Reading symbols from /usr/lib64/libz.so.1...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /usr/lib64/libz.so.1<br>
> Core was generated by
`/usr/local/lib/erlang/erts-5.9.2/bin/beam.smp -P<br>
> 1024000 -K true -- -root /usr/'.<br>
> Program terminated with signal 11, Segmentation fault.<br>
> #0 0x0000000000541b9a in check_process_code
(A__p=0x1a8c5b40,<br>
> BIF__ARGS=&lt;value optimized out&gt;) at
beam/beam_bif_load.c:487<br>
> 487 if (INSIDE((BeamInstr *)
funp->fe->address)) {<br>
> (gdb) bt<br>
> #0 0x0000000000541b9a in check_process_code
(A__p=0x1a8c5b40,<br>
> BIF__ARGS=&lt;value optimized out&gt;) at
beam/beam_bif_load.c:487<br>
> #1 check_process_code_2 (A__p=0x1a8c5b40,
BIF__ARGS=&lt;value optimized out&gt;)<br>
> at beam/beam_bif_load.c:205<br>
> #2 0x0000000000530782 in process_main () at
beam/beam_emu.c:3392<br>
> #3 0x00000000004a0b4f in sched_thread_func
(vesdp=&lt;value optimized out&gt;) at<br>
> beam/erl_process.c:5184<br>
> #4 0x00000000005a4f14 in thr_wrapper (vtwd=&lt;value
optimized out&gt;) at<br>
> pthread/ethread.c:110<br>
> #5 0x000000393fc0673d in start_thread () from
/lib64/libpthread.so.0<br>
> #6 0x000000393f4d3f6d in clone () from /lib64/libc.so.6<br>
> (gdb) p funp<br>
> $1 = &lt;value optimized out&gt;<br>
> (gdb) p funp->fe<br>
> Cannot access memory at address 0x8<br>
><br>
> [Second One]<br>
> Reading symbols from /lib64/libutil.so.1...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libutil.so.1<br>
> Reading symbols from /lib64/libdl.so.2...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libdl.so.2<br>
> Reading symbols from /lib64/libm.so.6...(no debugging
symbols found)...done.<br>
> Loaded symbols for /lib64/libm.so.6<br>
> Reading symbols from /usr/lib64/libncurses.so.5...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /usr/lib64/libncurses.so.5<br>
> Reading symbols from /lib64/libpthread.so.0...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libpthread.so.0<br>
> Reading symbols from /lib64/librt.so.1...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/librt.so.1<br>
> Reading symbols from /lib64/libc.so.6...(no debugging
symbols found)...done.<br>
> Loaded symbols for /lib64/libc.so.6<br>
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/ld-linux-x86-64.so.2<br>
> Reading symbols from<br>
>
/usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so...done.<br>
> Loaded symbols for
/usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so<br>
> Reading symbols from /lib64/libcrypto.so.6...(no
debugging symbols<br>
> found)...done.<br>
> Loaded symbols for /lib64/libcrypto.so.6<br>
> Reading symbols from /usr/lib64/libz.so.1...(no debugging
symbols<br>
> found)...done.<br>
> Loaded symbols for /usr/lib64/libz.so.1<br>
> Core was generated by
`/usr/local/lib/erlang/erts-5.9.2/bin/beam.smp -P<br>
> 1024000 -K true -- -root /usr/'.<br>
> Program terminated with signal 11, Segmentation fault.<br>
> #0 0x00000000004f8bac in sweep_off_heap
(p=0x2aaabc644b90, fullsweep=0) at<br>
> beam/erl_gc.c:2302<br>
> 2302 ptr = ptr->next;<br>
> (gdb) bt<br>
> #0 0x00000000004f8bac in sweep_off_heap
(p=0x2aaabc644b90, fullsweep=0) at<br>
> beam/erl_gc.c:2302<br>
> #1 0x00000000004fabb8 in do_minor (p=0x2aaabc644b90,
need=0,<br>
> objv=0x44719e00, nobj=1, recl=0x44719db8) at
beam/erl_gc.c:1133<br>
> #2 minor_collection (p=0x2aaabc644b90, need=0,
objv=0x44719e00, nobj=1,<br>
> recl=0x44719db8) at beam/erl_gc.c:827<br>
> #3 0x00000000004fc40d in erts_garbage_collect
(p=0x2aaabc644b90, need=0,<br>
> objv=0x44719e00, nobj=1) at beam/erl_gc.c:405<br>
> #4 0x00000000004fcdcf in erts_gc_after_bif_call
(p=0x2aaabc644b90,<br>
> result=46912734893994, regs=&lt;value optimized out&gt;,
arity=&lt;value optimized<br>
> out&gt;) at beam/erl_gc.c:335<br>
> #5 0x00000000005309c1 in process_main () at
beam/beam_emu.c:2600<br>
> #6 0x00000000004a0b4f in sched_thread_func
(vesdp=&lt;value optimized out&gt;) at<br>
> beam/erl_process.c:5184<br>
> #7 0x00000000005a4f14 in thr_wrapper (vtwd=&lt;value
optimized out&gt;) at<br>
> pthread/ethread.c:110<br>
> #8 0x000000393fc0673d in start_thread () from
/lib64/libpthread.so.0<br>
> #9 0x000000393f4d3f6d in clone () from /lib64/libc.so.6<br>
> (gdb) p ptr<br>
> $1 = (struct erl_off_heap_header *) 0x2aab02d63980<br>
> (gdb) x/x 0x2aab02d63980<br>
> 0x2aab02d63980: 0x000000f0<br>
> (gdb) p ptr->next<br>
> $2 = (struct erl_off_heap_header *) 0x2aab02d5ee80<br>
> (gdb) x/x 0x2aab02d5ee80<br>
> 0x2aab02d5ee80: 0x00000160<br>
><br>
> Any ideas? Thanks a lot.<br>
><br>
><br>
><br>
> --<br>
> View this message in context:
<a class="moz-txt-link-freetext" href="http://erlang.2086793.n4.nabble.com/erlang-node-crashes-in-erts-gc-after-bif-call-tp4655148.html">http://erlang.2086793.n4.nabble.com/erlang-node-crashes-in-erts-gc-after-bif-call-tp4655148.html</a><br>
> Sent from the Erlang Patches mailing list archive at
Nabble.com.<br>
> _______________________________________________<br>
> erlang-patches mailing list<br>
> <a class="moz-txt-link-abbreviated" href="mailto:erlang-patches@erlang.org">erlang-patches@erlang.org</a><br>
> <a class="moz-txt-link-freetext" href="http://erlang.org/mailman/listinfo/erlang-patches">http://erlang.org/mailman/listinfo/erlang-patches</a><br>
<br>
</includetail></div>
</blockquote>
<br>
</body>
</html>