<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Hi!<br>
    <br>
    On 10/19/2012 10:42 AM, adam chan wrote:
    <blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
      type="cite">
      <font>
        <div>Hi!</div>
        <div><br>
        </div>
        <div>I don't think the crypto NIF library was reloaded, though
          the MySQL client (author: Magnus Ahltorp
          <a class="moz-txt-link-rfc2396E" href="mailto:ahltorp@nada.kth.se">&lt;ahltorp@nada.kth.se&gt;</a>) in my project does use the crypto
          library. </div>
        <div><br>
        </div>
        <div>The crash happens when I upgrade a big data file (600K)
          that contains nothing but thousands of clauses like:</div>
        <div>&nbsp;&nbsp;&nbsp;&nbsp; get(xxx) -&gt; #record{a = xx, b = xx};</div>
        <div>I guess this data file has nothing to do with the crypto
          NIF library?  </div>
        <div><br>
        </div>
      </font></blockquote>
    Oops, that sure does not look like a crypto thing!<br>
    <blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
      type="cite"><font>
        <div>In particular, when I execute </div>
        <div>     c:l(data_file_name)</div>
        <div>repeatedly and quickly in the screen shell, the crash shows
          up frequently.</div>
      </font></blockquote>
    Could you send me the file and an example program, plus details of
    your distro/OS environment, so I can reproduce it?<br>
    Send it in a private mail to <a class="moz-txt-link-abbreviated" href="mailto:pan@erlang.org">pan@erlang.org</a>, since you may not
    want to spread details of your system publicly.<br>
    <blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
      type="cite"><font>
        <div> </div>
        <div><br>
        </div>
        <div>Yesterday I found that the stack size of my application
          had not been set, which means it was running with the Linux
          default stack size (10M). After I set the stack size to 500M
          using the 'ulimit -s' command and split the big data file
          into smaller files, the situation improved. Maybe the small
          stack size is the culprit, but I am not sure. :(</div>
        <div><br>
        </div>
      </font></blockquote>
    Nah, that should not be a problem as long as you do not use external
    libraries that go crazy on the C stack. We usually don't (except
    for re, which relies on the PCRE library, which has a rather
    aggressive approach to the C stack when compiling regexps).<br>
    <blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
      type="cite"><font>
        <div>Anyway, is there any way to detect whether the crypto
          NIF library has been reloaded? </div>
        <div>I've found a discussion about "fix native code crash when
          calling unloaded module with on_load function":</div>
        <div>   
 <a class="moz-txt-link-freetext" href="http://erlang.2086793.n4.nabble.com/fix-native-code-crash-when-calling-unloaded-module-with-on-load-function-td2273502.html">http://erlang.2086793.n4.nabble.com/fix-native-code-crash-when-calling-unloaded-module-with-on-load-function-td2273502.html</a></div>
        <div>I did suspect the crypto module before, since it has an
          on_load attribute.</div>
      </font></blockquote>
    In a few days there will be a fix for this in the master branch,
    which should remove the problem. But if you have an example that you
    can run from the shell and that I could use to reproduce the crash,
    things would go much faster. You could of course add a printout to
    the crypto module to track the calls of the on_load handler. If it's
    called more than twice, there will also be unloading (because of
    code purging). You could also trace calls to erlang:purge_module/1.<br>
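    A minimal sketch of such a trace from the Erlang shell, assuming the
    dbg module (part of runtime_tools) is available on the node;
    data_file_name is only a placeholder for whichever module gets
    reloaded:<br>
    <pre>%% Start the default tracer; trace messages are printed in the shell.
dbg:tracer().
%% Enable call tracing (with timestamps) in all processes.
dbg:p(all, [c, timestamp]).
%% Set a global trace pattern on erlang:purge_module/1.
dbg:tp(erlang, purge_module, 1, []).
%% Reproduce the reload here, for example: c:l(data_file_name).
%% When done, clear the trace pattern and stop the tracer.
dbg:ctp(erlang, purge_module, 1).
dbg:stop().
</pre>
    Each traced call shows the calling process and the module being
    purged, so a purge of crypto would show up directly in the output.<br>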
    <blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
      type="cite"><font>
        <div><br>
        </div>
        <div>Cheers,</div>
      </font>
      <div>
        <div style="font-size:14px;font-family:Verdana;color:#000;">
          <div>[Adam Chan]</div>
        </div>
      </div>
    </blockquote>
    Cheers,<br>
    /Patrik<br>
    <blockquote cite="mid:tencent_688EC67B109936B048868B39@qq.com"
      type="cite">
      <div><includetail>
          <div><br>
          </div>
          <div style="font-size: 12px;font-family: Arial
            Narrow;padding:2px 0 2px 0;">------------------ Original ------------------</div>
          <div style="font-size: 12px;background:#efefef;padding:8px;">
            <div><b>From: </b> "Patrik Nyblom"<a class="moz-txt-link-rfc2396E" href="mailto:pan@erlang.org">&lt;pan@erlang.org&gt;</a>;</div>
            <div><b>Date: </b> Thu, Oct 18, 2012 08:08 PM</div>
            <div><b>To: </b> "erlang-patches"<a class="moz-txt-link-rfc2396E" href="mailto:erlang-patches@erlang.org">&lt;erlang-patches@erlang.org&gt;</a>;
              <wbr></div>
            <div><b>Subject: </b> Re: [erlang-patches] erlang node
              crashes in erts_gc_after_bif_call</div>
          </div>
          <div><br>
          </div>
          Hi!<br>
          <br>
          Is the crypto NIF library reloaded during upgrade? That causes
          havoc <br>
          unfortunately, due to the behaviour of the OpenSSL crypto
          memory <br>
          allocation callbacks. We're working on that one.<br>
          <br>
          Have you reloaded the crypto NIF library, directly or
          indirectly, when <br>
          this happens?<br>
          <br>
          Cheers,<br>
          /Patrik<br>
          <br>
          On 10/17/2012 03:47 AM, adam chan wrote:<br>
          > hello list,<br>
          ><br>
          >      I ran into two random crashes this month, each of which
          happened more than twice.<br>
          > The cause was "Program terminated with signal 11,
          Segmentation<br>
          > fault", and they most likely happened while I was
          hot-updating some module code<br>
          > using code:soft_purge/1 and code:load_file/1.<br>
          >      Though they take place in different code, the
          information from the core file<br>
          > points out that the function erts_gc_after_bif_call/4 was
          being called when the crash<br>
          > happened. So I guess it is related to garbage collection.<br>
          >      I am using otp_src_R15B02, SMP mode.<br>
          >      Here is the information from the core file (I am not
          familiar with gdb,<br>
          > hope the information is useful)<br>
          ><br>
          >      [First One]<br>
          > Reading symbols from /lib64/libutil.so.1...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libutil.so.1<br>
          > Reading symbols from /lib64/libdl.so.2...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libdl.so.2<br>
          > Reading symbols from /lib64/libm.so.6...(no debugging
          symbols found)...done.<br>
          > Loaded symbols for /lib64/libm.so.6<br>
          > Reading symbols from /usr/lib64/libncurses.so.5...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /usr/lib64/libncurses.so.5<br>
          > Reading symbols from /lib64/libpthread.so.0...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libpthread.so.0<br>
          > Reading symbols from /lib64/librt.so.1...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/librt.so.1<br>
          > Reading symbols from /lib64/libc.so.6...(no debugging
          symbols found)...done.<br>
          > Loaded symbols for /lib64/libc.so.6<br>
          > Reading symbols from /lib64/ld-linux-x86-64.so.2...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/ld-linux-x86-64.so.2<br>
          > Reading symbols from<br>
          >
          /usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so...done.<br>
          > Loaded symbols for
          /usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so<br>
          > Reading symbols from /lib64/libcrypto.so.6...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libcrypto.so.6<br>
          > Reading symbols from /usr/lib64/libz.so.1...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /usr/lib64/libz.so.1<br>
          > Core was generated by
          `/usr/local/lib/erlang/erts-5.9.2/bin/beam.smp -P<br>
          > 1024000 -K true -- -root /usr/'.<br>
          > Program terminated with signal 11, Segmentation fault.<br>
          > #0  0x0000000000541b9a in check_process_code
          (A__p=0x1a8c5b40,<br>
          > BIF__ARGS=&lt;value optimized out&gt;) at
          beam/beam_bif_load.c:487<br>
          > 487                 if (INSIDE((BeamInstr *)
          funp->fe->address)) {<br>
          > (gdb) bt<br>
          > #0  0x0000000000541b9a in check_process_code
          (A__p=0x1a8c5b40,<br>
          > BIF__ARGS=&lt;value optimized out&gt;) at
          beam/beam_bif_load.c:487<br>
          > #1  check_process_code_2 (A__p=0x1a8c5b40,
          BIF__ARGS=&lt;value optimized out&gt;)<br>
          > at beam/beam_bif_load.c:205<br>
          > #2  0x0000000000530782 in process_main () at
          beam/beam_emu.c:3392<br>
          > #3  0x00000000004a0b4f in sched_thread_func
          (vesdp=&lt;value optimized out&gt;) at<br>
          > beam/erl_process.c:5184<br>
          > #4  0x00000000005a4f14 in thr_wrapper (vtwd=&lt;value
          optimized out&gt;) at<br>
          > pthread/ethread.c:110<br>
          > #5  0x000000393fc0673d in start_thread () from
          /lib64/libpthread.so.0<br>
          > #6  0x000000393f4d3f6d in clone () from /lib64/libc.so.6<br>
          > (gdb) p funp<br>
          > $1 = &lt;value optimized out&gt;<br>
          > (gdb) p funp->fe<br>
          > Cannot access memory at address 0x8<br>
          ><br>
          >      [Second One]<br>
          > Reading symbols from /lib64/libutil.so.1...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libutil.so.1<br>
          > Reading symbols from /lib64/libdl.so.2...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libdl.so.2<br>
          > Reading symbols from /lib64/libm.so.6...(no debugging
          symbols found)...done.<br>
          > Loaded symbols for /lib64/libm.so.6<br>
          > Reading symbols from /usr/lib64/libncurses.so.5...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /usr/lib64/libncurses.so.5<br>
          > Reading symbols from /lib64/libpthread.so.0...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libpthread.so.0<br>
          > Reading symbols from /lib64/librt.so.1...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/librt.so.1<br>
          > Reading symbols from /lib64/libc.so.6...(no debugging
          symbols found)...done.<br>
          > Loaded symbols for /lib64/libc.so.6<br>
          > Reading symbols from /lib64/ld-linux-x86-64.so.2...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/ld-linux-x86-64.so.2<br>
          > Reading symbols from<br>
          >
          /usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so...done.<br>
          > Loaded symbols for
          /usr/local/lib/erlang/lib/crypto-2.2/priv/lib/crypto.so<br>
          > Reading symbols from /lib64/libcrypto.so.6...(no
          debugging symbols<br>
          > found)...done.<br>
          > Loaded symbols for /lib64/libcrypto.so.6<br>
          > Reading symbols from /usr/lib64/libz.so.1...(no debugging
          symbols<br>
          > found)...done.<br>
          > Loaded symbols for /usr/lib64/libz.so.1<br>
          > Core was generated by
          `/usr/local/lib/erlang/erts-5.9.2/bin/beam.smp -P<br>
          > 1024000 -K true -- -root /usr/'.<br>
          > Program terminated with signal 11, Segmentation fault.<br>
          > #0  0x00000000004f8bac in sweep_off_heap
          (p=0x2aaabc644b90, fullsweep=0) at<br>
          > beam/erl_gc.c:2302<br>
          > 2302                ptr = ptr->next;<br>
          > (gdb) bt<br>
          > #0  0x00000000004f8bac in sweep_off_heap
          (p=0x2aaabc644b90, fullsweep=0) at<br>
          > beam/erl_gc.c:2302<br>
          > #1  0x00000000004fabb8 in do_minor (p=0x2aaabc644b90,
          need=0,<br>
          > objv=0x44719e00, nobj=1, recl=0x44719db8) at
          beam/erl_gc.c:1133<br>
          > #2  minor_collection (p=0x2aaabc644b90, need=0,
          objv=0x44719e00, nobj=1,<br>
          > recl=0x44719db8) at beam/erl_gc.c:827<br>
          > #3  0x00000000004fc40d in erts_garbage_collect
          (p=0x2aaabc644b90, need=0,<br>
          > objv=0x44719e00, nobj=1) at beam/erl_gc.c:405<br>
          > #4  0x00000000004fcdcf in erts_gc_after_bif_call
          (p=0x2aaabc644b90,<br>
          > result=46912734893994, regs=&lt;value optimized out&gt;,
          arity=&lt;value optimized<br>
          > out&gt;) at beam/erl_gc.c:335<br>
          > #5  0x00000000005309c1 in process_main () at
          beam/beam_emu.c:2600<br>
          > #6  0x00000000004a0b4f in sched_thread_func
          (vesdp=&lt;value optimized out&gt;) at<br>
          > beam/erl_process.c:5184<br>
          > #7  0x00000000005a4f14 in thr_wrapper (vtwd=&lt;value
          optimized out&gt;) at<br>
          > pthread/ethread.c:110<br>
          > #8  0x000000393fc0673d in start_thread () from
          /lib64/libpthread.so.0<br>
          > #9  0x000000393f4d3f6d in clone () from /lib64/libc.so.6<br>
          > (gdb) p ptr<br>
          > $1 = (struct erl_off_heap_header *) 0x2aab02d63980<br>
          > (gdb) x/x 0x2aab02d63980<br>
          > 0x2aab02d63980: 0x000000f0<br>
          > (gdb) p ptr->next<br>
          > $2 = (struct erl_off_heap_header *) 0x2aab02d5ee80<br>
          > (gdb) x/x 0x2aab02d5ee80<br>
          > 0x2aab02d5ee80: 0x00000160<br>
          ><br>
          >      Any ideas? Thanks a lot.<br>
          ><br>
          ><br>
          ><br>
          > --<br>
          > View this message in context:
<a class="moz-txt-link-freetext" href="http://erlang.2086793.n4.nabble.com/erlang-node-crashes-in-erts-gc-after-bif-call-tp4655148.html">http://erlang.2086793.n4.nabble.com/erlang-node-crashes-in-erts-gc-after-bif-call-tp4655148.html</a><br>
          > Sent from the Erlang Patches mailing list archive at
          Nabble.com.<br>
          > _______________________________________________<br>
          > erlang-patches mailing list<br>
          > <a class="moz-txt-link-abbreviated" href="mailto:erlang-patches@erlang.org">erlang-patches@erlang.org</a><br>
          > <a class="moz-txt-link-freetext" href="http://erlang.org/mailman/listinfo/erlang-patches">http://erlang.org/mailman/listinfo/erlang-patches</a><br>
          <br>
        </includetail></div>
      <br>
    </blockquote>
    <br>
  </body>
</html>