<html>
  <head>
    <meta content="text/html; charset=windows-1251"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi!<br>
      On 11/20/2012 10:40 PM, Denis Titoruk wrote:<br>
    </div>
    <blockquote
      cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
      type="cite">
      <div>Hi,</div>
      <div><br>
      </div>
      We've got the same error on R15B01 and R15B02.
      <div>I finished investigating this issue today, and here
        is the result:</div>
      <div><br>
      </div>
      <div>Let's assume we have the code:</div>
      <div>encode_formats(Columns) -><br>
           encode_formats(Columns, 0, <<>>).<br>
        <br>
        encode_formats([], Count, Acc) -><br>
           <<Count:?int16, Acc/binary>>;<br>
        <br>
        encode_formats([#column{format = Format} | T], Count, Acc) -><br>
           encode_formats(T, Count + 1, <<Acc/binary,
        Format:?int16>>).<br>
      </div>
      <div><br>
      </div>
      <div>So, <<Acc/binary, Format:?int16>> translates to</div>
      <div><br>
      </div>
      <div> 
         {bs_append,{f,0},{integer,16},0,7,8,{x,2},{field_flags,[]},{x,1}}.<br>
   {bs_put_integer,{f,0},{integer,16},1,{field_flags,[signed,big]},{x,6}}.<br>
      </div>
      <div><br>
      </div>
      <div>bs_append can trigger a GC, and the GC can reallocate the
        binary, but erts_current_bin, which bs_put_integer then uses,
        is not reassigned afterwards.</div>
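      <div>A minimal C sketch of the failure mode described above, under
        stated assumptions: DemoBin, demo_current_bin and the demo_*
        functions are hypothetical stand-ins, not ERTS types or
        functions. It only models the hypothesis that a cached write
        pointer must be refreshed after the underlying storage moves;
        it does not show how the emulator actually behaves.</div>
<pre>
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a writable binary; not the ERTS ProcBin. */
typedef struct {
    unsigned char *bytes;  /* start of the data */
    size_t size;           /* bytes used so far */
    size_t capacity;       /* bytes allocated */
} DemoBin;

/* Cached write pointer, playing the role erts_current_bin plays in the thread. */
static unsigned char *demo_current_bin = NULL;

/* Create an empty buffer. Returns 0 on success, -1 on allocation failure. */
int demo_init(DemoBin *b, size_t capacity) {
    size_t cap = capacity ? capacity : 1;
    b->bytes = malloc(cap);
    if (b->bytes == NULL) return -1;
    b->size = 0;
    b->capacity = cap;
    return 0;
}

/* Make room for `extra` more bytes. When growth is needed the data is
 * always moved to a fresh allocation, so the old address becomes invalid. */
int demo_reserve(DemoBin *b, size_t extra) {
    size_t need = b->size + extra;
    if (need <= b->capacity) return 0;
    size_t new_cap = need * 2;
    unsigned char *moved = malloc(new_cap);
    if (moved == NULL) return -1;
    memcpy(moved, b->bytes, b->size);
    free(b->bytes);
    b->bytes = moved;
    b->capacity = new_cap;
    return 0;
}

/* "Append" step: reserve room, then refresh the cached pointer.
 * The refresh is the analogue of the lines the proposed fix adds. */
int demo_append_prepare(DemoBin *b, size_t extra) {
    if (demo_reserve(b, extra) != 0) return -1;
    demo_current_bin = b->bytes;
    return 0;
}

/* "Put" step: write a big-endian 16-bit value through the cached pointer. */
void demo_put_int16(DemoBin *b, uint16_t value) {
    assert(demo_current_bin == b->bytes);   /* fails if the cache is stale */
    assert(b->size + 2 <= b->capacity);
    demo_current_bin[b->size]     = (unsigned char)(value >> 8);
    demo_current_bin[b->size + 1] = (unsigned char)(value & 0xffu);
    b->size += 2;
}
```
</pre>
      <div>In this sketch, dropping the assignment in
        demo_append_prepare would leave demo_current_bin pointing at
        freed memory after the first move, which is the kind of write
        that could corrupt an allocator's bookkeeping.</div>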
      <div><br>
      </div>
      <div>Fix:</div>
      <div><br>
      </div>
      <div>erl_bits.c:<br>
        Eterm<br>
        erts_bs_append(Process* c_p, Eterm* reg, Uint live, Eterm
        build_size_term,<br>
               Uint extra_words, Uint unit)<br>
        …<br>
           if (c_p->stop - c_p->htop < heap_need) {<br>
               (void) erts_garbage_collect(c_p, heap_need, reg, live+1);<br>
           }<br>
           sb = (ErlSubBin *) c_p->htop;<br>
           c_p->htop += ERL_SUB_BIN_SIZE;<br>
           sb->thing_word = HEADER_SUB_BIN;<br>
           sb->size = BYTE_OFFSET(used_size_in_bits);<br>
           sb->bitsize = BIT_OFFSET(used_size_in_bits);<br>
           sb->offs = 0;<br>
           sb->bitoffs = 0;<br>
           sb->is_writable = 1;<br>
           sb->orig = reg[live];<br>
        <br>
///////////////////////////////////////////////////////////////////</div>
      <div>// add these lines</div>
      <div>
        <div>///////////////////////////////////////////////////////////////////</div>
      </div>
      <div>   pb = (ProcBin *) boxed_val(sb->orig);</div>
      <div>   erts_current_bin = pb->bytes;<br>
           erts_writable_bin = 1;<br>
///////////////////////////////////////////////////////////////////<br>
        <br>
           return make_binary(sb);<br>
        …<br>
      </div>
      <div><br>
      </div>
    </blockquote>
    Can you reproduce the bug and verify that this fix really works?
    Binaries should *only* be reallocated in the gc if there are no
    active writers, and there obviously is one here (
    pb->flags |= PB_ACTIVE_WRITER  a few lines earlier). So if this
    code change actually removes the crash, the bug would be in the
    gc's detection of active writers.<br>
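    <div>The rule above can be sketched in C as follows. This is an
      illustration under assumptions, not the ERTS garbage collector:
      DemoProcBin, DEMO_ACTIVE_WRITER and demo_gc_shrink are
      hypothetical names, and the real check and data layout may
      differ. It only shows a collector skipping any binary whose
      active-writer flag is set.</div>
<pre>
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of PB_ACTIVE_WRITER; the real value is not assumed. */
#define DEMO_ACTIVE_WRITER 1u

/* Hypothetical stand-in for a process binary; not the ERTS ProcBin. */
typedef struct {
    unsigned char *bytes;  /* start of the data */
    size_t size;           /* bytes used */
    size_t capacity;       /* bytes allocated */
    unsigned flags;        /* DEMO_ACTIVE_WRITER while someone is appending */
} DemoProcBin;

/* Shrink the storage to fit, but only when nobody is writing to it.
 * Returns 1 if the data was moved, 0 if it was left alone,
 * and -1 on allocation failure. */
int demo_gc_shrink(DemoProcBin *pb) {
    assert(pb->size <= pb->capacity);
    if (pb->flags & DEMO_ACTIVE_WRITER) return 0;  /* writer holds a pointer */
    if (pb->capacity == pb->size) return 0;        /* nothing to reclaim */
    size_t new_cap = pb->size ? pb->size : 1;
    unsigned char *moved = malloc(new_cap);
    if (moved == NULL) return -1;
    memcpy(moved, pb->bytes, pb->size);
    free(pb->bytes);
    pb->bytes = moved;
    pb->capacity = new_cap;
    return 1;
}
```
</pre>
    <div>If a guard of this kind works as intended, a writer's cached
      pointer stays valid across the collection, which is why a crash
      that disappears with the proposed change would point at the
      detection of active writers instead.</div>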
     <br>
    <blockquote
      cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
      type="cite">
      <div><br>
      </div>
      <div>--</div>
      <div>Cheers,</div>
      <div>Denis</div>
    </blockquote>
    Cheers,<br>
    /Patrik<br>
    <blockquote
      cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
      type="cite">
      <div><br>
        <div>
          <div>On 20.11.2012, at 19:37, Musumeci, Antonio S wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite"><span class="Apple-style-span"
              style="border-collapse: separate; font-family: Helvetica;
              font-style: normal; font-variant: normal; font-weight:
              normal; letter-spacing: normal; line-height: normal;
              orphans: 2; text-align: -webkit-auto; text-indent: 0px;
              text-transform: none; white-space: normal; widows: 2;
              word-spacing: 0px; -webkit-border-horizontal-spacing: 0px;
              -webkit-border-vertical-spacing: 0px;
              -webkit-text-decorations-in-effect: none;
              -webkit-text-size-adjust: auto; -webkit-text-stroke-width:
              0px; font-size: medium; ">
              <div text="#000000" bgcolor="#ffffff">
                <div><br class="webkit-block-placeholder">
                </div>
                <div dir="ltr" align="left"><font color="#0000ff"
                    face="Arial" size="2">
                    <p align="left"><font color="#0000ff" face="Arial"
                        size="2"><font color="#0000ff" face="Arial"
                          size="2"><font color="#0000ff" face="Arial"
                            size="2">I've got lots of cores... but they
                            are all from optimized builds.</font></font></font></p>
                    <font color="#0000ff" face="Arial" size="2"><font
                        color="#0000ff" face="Arial" size="2"><font
                          color="#0000ff" face="Arial" size="2">
                          <p dir="ltr" align="left">Has this been seen
                            in other versions? We are keen to solve this
                            because it's causing us pain in production.
                            We hit another, older, memory bug (the 32bit
                            values used in 64bit build)... and now this.</p>
                        </font></font></font>
                    <p dir="ltr" align="left"><font color="#0000ff"
                        face="Arial" size="2"><font color="#0000ff"
                          face="Arial" size="2"><font color="#0000ff"
                            face="Arial" size="2">I'm going to be
                            building and trying R15B01 to see if we hit
                            it as well. I'll send any additional
                            information I can.</font></font></font><font
                        color="#000000" face="Times New Roman" size="3"> <span
                          class="403263615-20112012"><font
                            color="#0000ff" face="Arial" size="2">Any
                            suggestions on debugging beam would be
                            appreciated. Compile options, etc.</font></span></font></p>
                    <p dir="ltr" align="left">Thanks.</p>
                  </font>
                  <p dir="ltr" align="left"><font color="#0000ff"
                      face="Arial" size="2"><font color="#0000ff"
                        face="Arial" size="2"><font color="#0000ff"
                          face="Arial" size="2"><font color="#0000ff"
                            face="Arial" size="2">-antonio</font></font></font></font><br>
                  </p>
                </div>
                <div class="OutlookMessageHeader" dir="ltr" align="left"
                  lang="en-us">
                  <hr tabindex="-1"><font face="Tahoma" size="2"><b>From:</b><span
                      class="Apple-converted-space"> </span><a
                      moz-do-not-send="true"
                      href="mailto:erlang-bugs-bounces@erlang.org">erlang-bugs-bounces@erlang.org</a><span
                      class="Apple-converted-space"> </span>[<a class="moz-txt-link-freetext" href="mailto:erlang-bugs-bounces@erlang.org">mailto:erlang-bugs-bounces@erlang.org</a>]<span
                      class="Apple-converted-space"> </span><b>On Behalf
                      Of<span class="Apple-converted-space"> </span></b>Patrik
                    Nyblom<br>
                    <b>Sent:</b><span class="Apple-converted-space"> </span>Monday,
                    November 19, 2012 8:55 AM<br>
                    <b>To:</b><span class="Apple-converted-space"> </span><a
                      moz-do-not-send="true"
                      href="mailto:erlang-bugs@erlang.org">erlang-bugs@erlang.org</a><br>
                    <b>Subject:</b><span class="Apple-converted-space"> </span>Re:
                    [erlang-bugs] beam core'ing<br>
                  </font><br>
                </div>
                <div class="moz-cite-prefix">On 11/19/2012 02:01 PM,
                  Musumeci, Antonio S wrote:<br>
                </div>
                <blockquote
cite="mid:51C6F20DC46369418387C5250127649B039D96@HZWEX2014N4.msad.ms.com"
                  type="cite">
                  <div><br class="webkit-block-placeholder">
                  </div>
                  <div><span lang="EN">
                      <p><span class="483463912-19112012"><font
                            face="Arial" size="2">I'm just starting to
                            debug this but figured I'd send it along in
                            case anyone has seen this before.</font></span></p>
                      <p><span class="483463912-19112012"><font
                            face="Arial" size="2">64bit RHEL 5.0.1</font></span></p>
                      <p><span class="483463912-19112012"><font
                            face="Arial" size="2">built from source
                            beam.smp R15B02</font></span></p>
                      <p><span class="483463912-19112012"><font
                            face="Arial" size="2">Happens consistently
                            when trying to start our app and then just
                            stops after a time. Across a few boxes.
                            Oddly we have an identical cluster (hw and
                            sw) and it never happens.</font></span></p>
                    </span></div>
                </blockquote>
                <font size="2"><font face="Arial">Yes! I've seen it
                    before and have tried for several months to get a<font
                      size="2"><span class="Apple-converted-space"> </span>reproducible
                      example and a<font size="2"><span
                          class="Apple-converted-space"> </span></font>core
                      I can analyze here. I've had one core that was<font
                        size="2"><span class="Apple-converted-space"> </span>somewhat
                        readable but had no luck in locating the beam
                        code that triggered this. If you could try
                        narrowing it down, I would be really grateful!<br>
                        <br>
                        <font size="2">Please email me any findings,
                          theories, core dumps<font size="2"><span
                              class="Apple-converted-space"> </span>-
                            anything! I really want to find this! The
                            most interesting would be to find the
                            snippet of erlang code that makes this
                            happen (intermittently probably).<br>
                            <br>
                            <font size="2">The problem is<span
                                class="Apple-converted-space"> </span><font
                                size="2">that<span
                                  class="Apple-converted-space"> </span><font
                                  size="2">when the allocators crash,
                                  the error is usually somewhere else<font
                                    size="2">.</font><span
                                    class="Apple-converted-space"> </span><font
                                    size="2">A</font>ccess of freed
                                  memory, double free or something else
                                  doing horrid things to memory. Ob<font
                                    size="2">viously none of our testsui<font
                                      size="2">tes e<font size="2">xercise
                                        this bug as<span
                                          class="Apple-converted-space"> </span><font
                                          size="2">neither our debug
                                          builds, nor our valgrind runs
                                          find it. It happens on both
                                          SMP and non SMP and is always
                                          in the context of the er<font
                                            size="2">ts</font>_bs_append</font></font></font></font></font></font></font></font></font></font></font></font></font>,
                so I'm pretty sure this has a connection to the other
                users seeing the crash in the allocat<font size="2">ors<font
                    size="2">...</font></font><span
                  class="Apple-converted-space"> </span><br>
                <br>
                Cheers,<br>
                Patrik<br>
                <blockquote
cite="mid:51C6F20DC46369418387C5250127649B039D96@HZWEX2014N4.msad.ms.com"
                  type="cite">
                  <div><span lang="EN">
                      <p>#0 bf_unlink_free_block (flags=<optimized
                        out>, block=0x6f00, allctr=<optimized
                        out>) at beam/erl_bestfit_alloc.c:789<br>
                        #1 bf_get_free_block (allctr=0x6824600,
                        size=304, cand_blk=0x0, cand_size=<optimized
                        out>, flags=0) at
                        beam/erl_bestfit_alloc.c:869<br>
                        #2 0x000000000045343c in mbc_alloc_block
                        (alcu_flgsp=<optimized out>,
                        blk_szp=<optimized out>,
                        size=<optimized out>, allctr=<optimized
                        out>) at beam/erl_alloc_util.c:1198<br>
                        #3 mbc_alloc (allctr=0x6824600, size=295) at
                        beam/erl_alloc_util.c:1345<br>
                        #4 0x000000000045398d in do_erts_alcu_alloc
                        (type=164, extra=0x6824600, size=295) at
                        beam/erl_alloc_util.c:3442<br>
                        #5 0x0000000000453a0f in
                        erts_alcu_alloc_thr_pref (type=164,
                        extra=<optimized out>, size=287) at
                        beam/erl_alloc_util.c:3520<br>
                        #6 0x0000000000511463 in erts_alloc (size=287,
                        type=<optimized out>) at
                        beam/erl_alloc.h:208<br>
                        #7 erts_bin_nrml_alloc (size=<optimized
                        out>) at beam/erl_binary.h:260<br>
                        #8 erts_bs_append (c_p=0x69fba60,
                        reg=<optimized out>, live=<optimized
                        out>, build_size_term=<optimized out>,
                        extra_words=0, unit=8)<span
                          class="483463912-19112012"><span
                            class="Apple-converted-space"> </span></span>at
                        beam/erl_bits.c:1327<br>
                        #9 0x000000000053ffd8 in process_main () at
                        beam/beam_emu.c:3858<span
                          class="Apple-converted-space"> </span><br>
                        #10 0x00000000004ae853 in sched_thread_func
                        (vesdp=<optimized out>) at
                        beam/erl_process.c:5184<span
                          class="483463912-19112012"><span
                            class="Apple-converted-space"> </span><br>
                        </span>#11 0x00000000005c17e9 in thr_wrapper
                        (vtwd=<optimized out>) at
                        pthread/ethread.c:106<span
                          class="483463912-19112012"><span
                            class="Apple-converted-space"> </span><br>
                        </span>#12 0x00002b430f39e73d in start_thread ()
                        from /lib64/libpthread.so.0<span
                          class="483463912-19112012"><span
                            class="Apple-converted-space"> </span><br>
                        </span>#13 0x00002b430f890f6d in clone () from
                        /lib64/libc.so.6<span class="483463912-19112012"><span
                            class="Apple-converted-space"> </span><br>
                        </span>#14 0x0000000000000000 in ?? ()</p>
                    </span></div>
                  <br>
                  <br>
                  <hr id="HR1"><br>
                  <br>
                  <fieldset class="mimeAttachmentHeader"></fieldset>
                  <br>
                  <pre wrap="">_______________________________________________
erlang-bugs mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:erlang-bugs@erlang.org">erlang-bugs@erlang.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://erlang.org/mailman/listinfo/erlang-bugs">http://erlang.org/mailman/listinfo/erlang-bugs</a>
</pre>
                </blockquote>
                </div>
            </span></blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>