<html>
  <head>
    <meta content="text/html; charset=KOI8-R" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi!<br>
      <br>
      Thanks to everyone helping out trying to find this bug! <br>
      <br>
      With the help of Denis, I now have a verified fix for the garbage
      collector bug, which moved a "fixed" (and writable) binary in the
      middle of erts_bs_append (erts_bs_append in erl_bits.c was the
      "innocent bystander" triggering the GC bug). The bugfix will be a
      last-minute contribution to R15B03, but I am also attaching a
      source patch to this mail. <br>
      <br>
      Cheers,<br>
      /Patrik<br>
      On 11/21/2012 05:35 PM, Musumeci, Antonio S wrote:<br>
    </div>
    <blockquote
cite="mid:51C6F20DC46369418387C5250127649B03B8FF@HZWEX2014N4.msad.ms.com"
      type="cite">
      <div dir="ltr" align="left"><span class="516193016-21112012"><font
            color="#0000ff" face="Arial" size="2">Something my team just
            noticed is that our segv consistently occurs right after a
            reboot of the box, after which beam appears to work all right.
            We are trying to narrow down what code is triggering it, but
            it may take some time.</font></span></div>
      <br>
      <div class="OutlookMessageHeader" dir="ltr" align="left"
        lang="en-us">
        <hr tabindex="-1">
        <font face="Tahoma" size="2"><b>From:</b> Patrik Nyblom
          [<a class="moz-txt-link-freetext" href="mailto:pan@erlang.org">mailto:pan@erlang.org</a>] <br>
          <b>Sent:</b> Wednesday, November 21, 2012 6:09 AM<br>
          <b>To:</b> Denis Titoruk<br>
          <b>Cc:</b> Musumeci, Antonio S (Enterprise Infrastructure);
          <a class="moz-txt-link-abbreviated" href="mailto:erlang-bugs@erlang.org">erlang-bugs@erlang.org</a><br>
          <b>Subject:</b> Re: [erlang-bugs] beam core'ing<br>
        </font><br>
      </div>
      <div class="moz-cite-prefix">Hi again :)<br>
        <br>
        Another thing that would be helpful is if you could create a
        crash dump instead of calling fprintf when the binary is wrongly
        moved, i.e. call erl_exit(ERTS_DUMP_EXIT, "erts_current_bin !=
        (pb->bytes)"); instead of the fprintf. Then you could isolate
        the Erlang code snippet that exercises the bug, and I could
        maybe create a smaller test case... A simple test case would be
        really helpful when diving into the GC :)<br>
        <br>
        Cheers,<br>
        /Patrik<br>
        <br>
        On 11/21/2012 11:21 AM, Denis Titoruk wrote:<br>
      </div>
      <blockquote
        cite="mid:8436D993-C4FC-4822-B0E8-7A2D6AB2E0C9@gmail.com"
        type="cite">
        <br>
        <div>
          <div>On 21.11.2012, at 13:44, Patrik Nyblom wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite">
            <div text="#000000" bgcolor="#FFFFFF">
              <div class="moz-cite-prefix">Hi!<br>
                On 11/20/2012 10:40 PM, Denis Titoruk wrote:<br>
              </div>
              <blockquote
                cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
                type="cite">
                <div>Hi,</div>
                <div><br>
                </div>
                We've got the same error on R15B01 and R15B02.
                <div>I finished my investigation of this issue today,
                  and here is the result:</div>
                <div><br>
                </div>
                <div>Let's assume we have the code:</div>
                <div>encode_formats(Columns) -><br>
                     encode_formats(Columns, 0, <<>>).<br>
                  <br>
                  encode_formats([], Count, Acc) -><br>
                     <<Count:?int16, Acc/binary>>;<br>
                  <br>
                  encode_formats([#column{format = Format} | T], Count, Acc) -><br>
                     encode_formats(T, Count + 1, <<Acc/binary, Format:?int16>>).<br>
                </div>
                <div><br>
                </div>
                <div>So, <<Acc/binary, Format:?int16>> translates to</div>
                <div><br>
                </div>
                <div>
                  {bs_append,{f,0},{integer,16},0,7,8,{x,2},{field_flags,[]},{x,1}}.<br>
                  {bs_put_integer,{f,0},{integer,16},1,{field_flags,[signed,big]},{x,6}}.<br>
                </div>
                <div><br>
                </div>
                <div>bs_append can trigger a GC, and that GC can
                  reallocate the binary, but erts_current_bin, which
                  bs_put_integer then uses, is not reassigned
                  afterwards.</div>
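                <div>The stale-pointer effect described above can be
                  illustrated outside the emulator. The following is a
                  minimal, self-contained C sketch, not ERTS code: ToyBin,
                  toy_current_bin, toy_gc_move and the other toy_* names
                  are hypothetical stand-ins for ProcBin, erts_current_bin
                  and the collector, and the "move" is simply a copy
                  between two fixed buffers.</div>
<pre>
```c
#include <stddef.h>
#include <string.h>

#define TOY_BIN_SIZE 8

/* Hypothetical stand-in for a ProcBin whose data can be relocated. */
typedef struct {
    unsigned char space[2][TOY_BIN_SIZE]; /* two copies, only one is live */
    int live;                             /* index of the live copy */
    unsigned char *bytes;                 /* always points at space[live] */
} ToyBin;

/* Hypothetical stand-in for the cached erts_current_bin pointer. */
static unsigned char *toy_current_bin;

/* Zero both copies, make copy 0 live, and cache its address. */
void toy_bin_init(ToyBin *pb)
{
    memset(pb->space, 0, sizeof pb->space);
    pb->live = 0;
    pb->bytes = pb->space[0];
    toy_current_bin = pb->bytes;
}

/* Simulated collector: copy the data to the other buffer and switch to it.
 * The cached pointer is deliberately left untouched. */
void toy_gc_move(ToyBin *pb)
{
    int next = 1 - pb->live;
    memcpy(pb->space[next], pb->space[pb->live], TOY_BIN_SIZE);
    pb->live = next;
    pb->bytes = pb->space[next];
}

/* Same comparison as the diagnostic in this thread. */
int toy_cache_is_stale(const ToyBin *pb)
{
    return toy_current_bin != pb->bytes;
}

/* Re-read the data pointer after a possible move. */
void toy_refresh_cache(const ToyBin *pb)
{
    toy_current_bin = pb->bytes;
}

/* Writes through the cached pointer, as a later "put" step would. */
void toy_put_byte(size_t offset, unsigned char value)
{
    toy_current_bin[offset] = value;
}
```
</pre>
                <div>In this sketch, a write made through the cached
                  pointer after toy_gc_move lands in the old copy and
                  never reaches the live one; calling toy_refresh_cache
                  first corresponds to reassigning the cached
                  pointer.</div>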
                <div><br>
                </div>
                <div>Fix:</div>
                <div><br>
                </div>
                <div>erl_bits.c:<br>
                  Eterm<br>
                  erts_bs_append(Process* c_p, Eterm* reg, Uint live,
                  Eterm build_size_term,<br>
                         Uint extra_words, Uint unit)<br>
                  …<br>
                     if (c_p->stop - c_p->htop < heap_need) {<br>
                         (void) erts_garbage_collect(c_p, heap_need, reg, live+1);<br>
                     }<br>
                     sb = (ErlSubBin *) c_p->htop;<br>
                     c_p->htop += ERL_SUB_BIN_SIZE;<br>
                     sb->thing_word = HEADER_SUB_BIN;<br>
                     sb->size = BYTE_OFFSET(used_size_in_bits);<br>
                     sb->bitsize = BIT_OFFSET(used_size_in_bits);<br>
                     sb->offs = 0;<br>
                     sb->bitoffs = 0;<br>
                     sb->is_writable = 1;<br>
                     sb->orig = reg[live];<br>
                  <br>
///////////////////////////////////////////////////////////////////</div>
                <div>// add these lines</div>
                <div>
                  <div>///////////////////////////////////////////////////////////////////</div>
                </div>
                <div>   pb = (ProcBin *) boxed_val(sb->orig);</div>
                <div>   erts_current_bin = pb->bytes;<br>
                     erts_writable_bin = 1;<br>
///////////////////////////////////////////////////////////////////<br>
                  <br>
                     return make_binary(sb);<br>
                  …<br>
                </div>
                <div><br>
                </div>
              </blockquote>
              Can you reproduce the bug and verify that this fix really
              works? The thing is that binaries should *only* be
              reallocated in the GC if there are no active writers, and
              there obviously is one here (pb->flags |=
              PB_ACTIVE_WRITER a few lines earlier). So if this code
              change actually removes the crash, the bug would be in
              the GC's detection of active writers.<br>
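              <div>The invariant stated here (a binary with an active
                writer must not be moved) can be sketched in plain C.
                This is a toy model under stated assumptions, not the
                ERTS collector: ToyBin, TOY_ACTIVE_WRITER and
                toy_gc_maybe_move are hypothetical names, with the flag
                standing in for PB_ACTIVE_WRITER.</div>
<pre>
```c
#include <stddef.h>
#include <string.h>

#define TOY_BIN_SIZE 8
#define TOY_ACTIVE_WRITER 0x1u /* hypothetical analogue of PB_ACTIVE_WRITER */

/* Hypothetical stand-in for a ProcBin whose data can be relocated. */
typedef struct {
    unsigned char space[2][TOY_BIN_SIZE]; /* two copies, only one is live */
    int live;                             /* index of the live copy */
    unsigned char *bytes;                 /* always points at space[live] */
    unsigned flags;                       /* TOY_ACTIVE_WRITER or 0 */
} ToyBin;

/* Zero both copies, make copy 0 live, and clear all flags. */
void toy_bin_init(ToyBin *pb)
{
    memset(pb->space, 0, sizeof pb->space);
    pb->live = 0;
    pb->bytes = pb->space[0];
    pb->flags = 0;
}

/* Simulated collector step. Returns 1 if the data was moved and 0 if the
 * binary was left in place because a writer is still active. */
int toy_gc_maybe_move(ToyBin *pb)
{
    int next;

    if (pb->flags & TOY_ACTIVE_WRITER) {
        return 0; /* pinned: a writer may hold a raw pointer into bytes */
    }
    next = 1 - pb->live;
    memcpy(pb->space[next], pb->space[pb->live], TOY_BIN_SIZE);
    pb->live = next;
    pb->bytes = pb->space[next];
    return 1;
}
```
</pre>
              <div>If the check in toy_gc_maybe_move were skipped or
                wrong, a writer holding the old address would be left
                with a stale pointer, which is the kind of failure
                discussed in this thread.</div>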
            </div>
          </blockquote>
          <div><br>
          </div>
          <div>Yes, it works in my case. I don't have a simple test case
            for reproducing this bug (I actually run a few processes
            that send requests to pgsql).</div>
          <div><br>
          </div>
          <div>
            <div>    pb = (ProcBin *) boxed_val(sb->orig);</div>
            <div>    if (erts_current_bin != (pb->bytes)) {</div>
            <div>        fprintf(stderr, "erts_current_bin != (pb->bytes)\n");</div>
            <div>        fflush(stderr);</div>
            <div>    }</div>
            <div>    erts_current_bin = pb->bytes;</div>
            <div>    erts_writable_bin = 1;</div>
          </div>
          <div><br>
          </div>
          <div><br>
          </div>
          <div>
            <div>(jskit@siden)1> f(F), F = fun() ->
              postgresql:equery('echo-customers', write, <<"some
              query here">>, []) end.</div>
            <div>#Fun<erl_eval.20.82930912></div>
            <div>(jskit@siden)2> perftest:comprehensive(1000, F).</div>
            <div>Sequential 100 cycles in ~1 seconds (100 cycles/s)</div>
            <div>Sequential 200 cycles in ~2 seconds (106 cycles/s)</div>
            <div>Sequential 1000 cycles in ~12 seconds (85 cycles/s)</div>
            <div>Parallel 2 1000 cycles in ~8 seconds (132 cycles/s)</div>
            <div>Parallel 4 1000 cycles in ~8 seconds (121 cycles/s)</div>
            <div>Parallel 10 1000 cycles in ~8 seconds (119 cycles/s)</div>
            <div>Parallel 100 1000 cycles in ~13 seconds (74 cycles/s)</div>
            <div>[85,132,121,119,74]</div>
            <div>(jskit@siden)3> perftest:comprehensive(1000, F).</div>
            <div>Sequential 100 cycles in ~1 seconds (83 cycles/s)</div>
            <div>Sequential 200 cycles in ~2 seconds (83 cycles/s)</div>
            <div>Sequential 1000 cycles in ~14 seconds (71 cycles/s)</div>
            <div>Parallel 2 1000 cycles in ~11 seconds (95 cycles/s)</div>
            <div>Parallel 4 1000 cycles in ~10 seconds (105 cycles/s)</div>
            <div>Parallel 10 1000 cycles in ~11 seconds (91 cycles/s)</div>
            <div>Parallel 100 1000 cycles in ~13 seconds (76 cycles/s)</div>
            <div>"G_i[L"</div>
            <div>(jskit@siden)4> perftest:comprehensive(1000, F).</div>
            <div>Sequential 100 cycles in ~1 seconds (88 cycles/s)</div>
            <div>Sequential 200 cycles in ~2 seconds (85 cycles/s)</div>
            <div>Sequential 1000 cycles in ~13 seconds (74 cycles/s)</div>
            <div>Parallel 2 1000 cycles in ~9 seconds (109 cycles/s)</div>
            <div>Parallel 4 1000 cycles in ~10 seconds (101 cycles/s)</div>
            <div>Parallel 10 1000 cycles in ~11 seconds (95 cycles/s)</div>
            <div>erts_current_bin != (pb->bytes)</div>
            <div>Parallel 100 1000 cycles in ~13 seconds (77 cycles/s)</div>
            <div>"Jme_M"</div>
          </div>
          <br>
          <blockquote type="cite">
            <div text="#000000" bgcolor="#FFFFFF"><br>
              <blockquote
                cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
                type="cite">
                <div><br>
                </div>
                <div>--</div>
                <div>Cheers,</div>
                <div>Denis</div>
              </blockquote>
              Cheers,<br>
              /Patrik<br>
              <blockquote
                cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
                type="cite">
                <div><br>
                  <div>
                    <div>On 20.11.2012, at 19:37, Musumeci, Antonio S
                      wrote:</div>
                    <br class="Apple-interchange-newline">
                    <blockquote type="cite">
                      <div text="#000000" bgcolor="#ffffff">
                        <div><br class="webkit-block-placeholder">
                        </div>
                        <div dir="ltr" align="left"><font
                            color="#0000ff" face="Arial" size="2">
                            <p align="left"><font color="#0000ff"
                                face="Arial" size="2"><font
                                  color="#0000ff" face="Arial" size="2"><font
                                    color="#0000ff" face="Arial"
                                    size="2">I've got lots of cores...
                                    but they are all from optimized
                                    builds.</font></font></font></p>
                            <font color="#0000ff" face="Arial" size="2"><font
                                color="#0000ff" face="Arial" size="2"><font
                                  color="#0000ff" face="Arial" size="2">
                                  <p dir="ltr" align="left">Has this
                                    been seen in other versions? We are
                                    keen to solve this because it's
                                    causing us pain in production. We
                                    hit another, older, memory bug (the
                                    32bit values used in 64bit build)...
                                    and now this.</p>
                                </font></font></font>
                            <p dir="ltr" align="left"><font
                                color="#0000ff" face="Arial" size="2"><font
                                  color="#0000ff" face="Arial" size="2"><font
                                    color="#0000ff" face="Arial"
                                    size="2">I'm going to be building
                                    and trying R15B01 to see if we hit
                                    it as well. I'll send any additional
                                    information I can.</font></font></font><font
                                color="#000000" face="Times New Roman"
                                size="3"> <span
                                  class="403263615-20112012"><font
                                    color="#0000ff" face="Arial"
                                    size="2">Any suggestions on
                                    debugging beam would be appreciated.
                                    Compile options, etc.</font></span></font></p>
                            <p dir="ltr" align="left">Thanks.</p>
                          </font>
                          <p dir="ltr" align="left"><font
                              color="#0000ff" face="Arial" size="2"><font
                                color="#0000ff" face="Arial" size="2"><font
                                  color="#0000ff" face="Arial" size="2"><font
                                    color="#0000ff" face="Arial"
                                    size="2">-antonio</font></font></font></font><br>
                          </p>
                        </div>
                        <div class="OutlookMessageHeader" dir="ltr"
                          align="left" lang="en-us">
                          <hr tabindex="-1">
                          <font face="Tahoma" size="2"><b>From:</b> <a
                              href="mailto:erlang-bugs-bounces@erlang.org"
                              moz-do-not-send="true">erlang-bugs-bounces@erlang.org</a> [<a
                              class="moz-txt-link-freetext"
                              href="mailto:erlang-bugs-bounces@erlang.org"
                              moz-do-not-send="true">mailto:erlang-bugs-bounces@erlang.org</a>] <b>On
                              Behalf Of </b>Patrik Nyblom<br>
                            <b>Sent:</b> Monday, November 19, 2012 8:55 AM<br>
                            <b>To:</b> <a
                              href="mailto:erlang-bugs@erlang.org"
                              moz-do-not-send="true">erlang-bugs@erlang.org</a><br>
                            <b>Subject:</b> Re: [erlang-bugs] beam core'ing<br>
                          </font><br>
                        </div>
                        <div class="moz-cite-prefix">On 11/19/2012 02:01
                          PM, Musumeci, Antonio S wrote:<br>
                        </div>
                        <blockquote
cite="mid:51C6F20DC46369418387C5250127649B039D96@HZWEX2014N4.msad.ms.com"
                          type="cite">
                          <div><br class="webkit-block-placeholder">
                          </div>
                          <div><span lang="EN">
                              <p><span class="483463912-19112012"><font
                                    face="Arial" size="2">I'm just
                                    starting to debug this but figured
                                    I'd send it along in case anyone has
                                    seen this before.</font></span></p>
                              <p><span class="483463912-19112012"><font
                                    face="Arial" size="2">64bit RHEL
                                    5.0.1</font></span></p>
                              <p><span class="483463912-19112012"><font
                                    face="Arial" size="2">built from
                                    source beam.smp R15B02</font></span></p>
                              <p><span class="483463912-19112012"><font
                                    face="Arial" size="2">Happens
                                    consistently when trying to start
                                    our app and then just stops after a
                                    time. Across a few boxes. Oddly we
                                    have an identical cluster (hw and
                                    sw) and it never happens.</font></span></p>
                            </span></div>
                        </blockquote>
                        <font size="2"><font face="Arial">Yes! I've seen
                            it before and have tried for several months
                            to get a reproducible example and a core I
                            can analyze here. I've had one core that was
                            somewhat readable but had no luck in locating
                            the beam code that triggered this. If you
                            could try narrowing it down, I would be
                            really grateful!<br>
                            <br>
                            Please email me any findings, theories, core
                            dumps, anything! I really want to find this!
                            The most interesting would be to find the
                            snippet of Erlang code that makes this happen
                            (probably intermittently).<br>
                            <br>
                            The problem is that when the allocators
                            crash, the error is usually somewhere else:
                            access of freed memory, a double free, or
                            something else doing horrid things to memory.
                            Obviously none of our test suites exercise
                            this bug, as neither our debug builds nor our
                            valgrind runs find it. It happens on both SMP
                            and non-SMP and is always in the context of
                            erts_bs_append</font></font>,
                        so I'm pretty sure this has a connection to the
                        other users seeing the crash in the
                        allocators...<br>
                        <br>
                        Cheers,<br>
                        Patrik<br>
                        <blockquote
cite="mid:51C6F20DC46369418387C5250127649B039D96@HZWEX2014N4.msad.ms.com"
                          type="cite">
                          <div><span lang="EN">
                              <p>#0 bf_unlink_free_block
                                (flags=<optimized out>,
                                block=0x6f00, allctr=<optimized
                                out>) at beam/erl_bestfit_alloc.c:789<br>
                                #1 bf_get_free_block (allctr=0x6824600,
                                size=304, cand_blk=0x0,
                                cand_size=<optimized out>,
                                flags=0) at beam/erl_bestfit_alloc.c:869<br>
                                #2 0x000000000045343c in mbc_alloc_block
                                (alcu_flgsp=<optimized out>,
                                blk_szp=<optimized out>,
                                size=<optimized out>,
                                allctr=<optimized out>) at
                                beam/erl_alloc_util.c:1198<br>
                                #3 mbc_alloc (allctr=0x6824600,
                                size=295) at beam/erl_alloc_util.c:1345<br>
                                #4 0x000000000045398d in
                                do_erts_alcu_alloc (type=164,
                                extra=0x6824600, size=295) at
                                beam/erl_alloc_util.c:3442<br>
                                #5 0x0000000000453a0f in
                                erts_alcu_alloc_thr_pref (type=164,
                                extra=<optimized out>, size=287)
                                at beam/erl_alloc_util.c:3520<br>
                                #6 0x0000000000511463 in erts_alloc
                                (size=287, type=<optimized out>)
                                at beam/erl_alloc.h:208<br>
                                #7 erts_bin_nrml_alloc
                                (size=<optimized out>) at
                                beam/erl_binary.h:260<br>
                                #8 erts_bs_append (c_p=0x69fba60,
                                reg=<optimized out>,
                                live=<optimized out>,
                                build_size_term=<optimized out>,
                                extra_words=0, unit=8) at
                                beam/erl_bits.c:1327<br>
                                #9 0x000000000053ffd8 in process_main ()
                                at beam/beam_emu.c:3858<br>
                                #10 0x00000000004ae853 in
                                sched_thread_func (vesdp=<optimized
                                out>) at beam/erl_process.c:5184<br>
                                #11 0x00000000005c17e9 in
                                thr_wrapper (vtwd=<optimized out>)
                                at pthread/ethread.c:106<br>
                                #12 0x00002b430f39e73d in
                                start_thread () from
                                /lib64/libpthread.so.0<br>
                                #13 0x00002b430f890f6d in clone
                                () from /lib64/libc.so.6<br>
                                #14 0x0000000000000000 in ?? ()</p>
                            </span></div>
                          <br>
                          <br>
                          <hr id="HR1">
                          <br>
                          <fieldset class="mimeAttachmentHeader"></fieldset>
                          <br>
                          <pre wrap="">_______________________________________________
erlang-bugs mailing list
<a class="moz-txt-link-abbreviated" href="mailto:erlang-bugs@erlang.org" moz-do-not-send="true">erlang-bugs@erlang.org</a>
<a class="moz-txt-link-freetext" href="http://erlang.org/mailman/listinfo/erlang-bugs" moz-do-not-send="true">http://erlang.org/mailman/listinfo/erlang-bugs</a>
</pre>
                        </blockquote>
                        </div>
                    </blockquote>
                  </div>
                  <br>
                </div>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
        <br>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>