<html>
  <head>
    <meta content="text/html; charset=windows-1251"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi again :)<br>
      <br>
      Another thing that would help: could you generate a crash dump
      instead of the fprintf when the binary is wrongly moved, i.e.
      call erl_exit(ERTS_DUMP_EXIT, "erts_current_bin !=
      (pb->bytes)");? Then you could isolate the Erlang code snippet
      that exercises the bug, and I might be able to create a smaller
      test case. A simple test case would be really helpful when diving
      into the GC :)<br>
      <br>
      Cheers,<br>
      /Patrik<br>
      <br>
      On 11/21/2012 11:21 AM, Denis Titoruk wrote:<br>
    </div>
    <blockquote
      cite="mid:8436D993-C4FC-4822-B0E8-7A2D6AB2E0C9@gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1251">
      <br>
      <div>
        <div>On 21.11.2012, at 13:44, Patrik Nyblom wrote:</div>
        <br class="Apple-interchange-newline">
        <blockquote type="cite">
          <div bgcolor="#FFFFFF" text="#000000">
            <div class="moz-cite-prefix">Hi!<br>
              On 11/20/2012 10:40 PM, Denis Titoruk wrote:<br>
            </div>
            <blockquote
              cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
              type="cite"> <base href="x-msg://13417/">
              <div>Hi,</div>
              <div><br>
              </div>
              We've got the same error on R15B01 and R15B02.
              <div>I finished investigating this issue today, and here
                is the result:</div>
              <div><br>
              </div>
              <div>Let's assume we have the code:</div>
              <div>encode_formats(Columns) -><br>
                   encode_formats(Columns, 0, <<>>).<br>
                <br>
                encode_formats([], Count, Acc) -><br>
                   <<Count:?int16, Acc/binary>>;<br>
                <br>
                encode_formats([#column{format = Format} | T], Count,
                Acc) -><br>
                   encode_formats(T, Count + 1, <<Acc/binary,
                Format:?int16>>).<br>
              </div>
              <div><br>
              </div>
              <div>So, <<Acc/binary,
                Format:?int16>> translates to</div>
              <div><br>
              </div>
              <div> 
                 {bs_append,{f,0},{integer,16},0,7,8,{x,2},{field_flags,[]},{x,1}}.<br>
   {bs_put_integer,{f,0},{integer,16},1,{field_flags,[signed,big]},{x,6}}.<br>
              </div>
              <div><br>
              </div>
              <div>bs_append can trigger a GC, and the GC can reallocate
                the binary, but erts_current_bin, which bs_put_integer
                then uses, is not reassigned afterwards.</div>
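              <div>To illustrate the suspected sequence, here is a
                minimal toy model in C. It is not the ERTS code: ToyBin,
                toy_slots, toy_current_bin and the toy_* functions are
                made-up stand-ins, and it assumes the cached pointer is
                set before the collection runs.</div>
<pre>
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_CAP = 16 };

/* Two fixed slots stand in for the "old" and "new" locations of the
 * data. Everything here is hypothetical, not an ERTS structure. */
static unsigned char toy_slots[2][TOY_CAP];

typedef struct {
    unsigned char *bytes; /* current location of the data */
    size_t size;          /* bytes in use */
    int slot;             /* which slot bytes points into */
} ToyBin;

/* Cached write target, playing the role of erts_current_bin. */
static unsigned char *toy_current_bin;

void toy_init(ToyBin *b)
{
    memset(toy_slots, 0, sizeof toy_slots);
    b->slot = 0;
    b->bytes = toy_slots[0];
    b->size = 0;
    toy_current_bin = NULL;
}

/* Simulated collection: the data moves to the other slot. */
void toy_gc_move(ToyBin *b)
{
    int next = 1 - b->slot;
    memcpy(toy_slots[next], b->bytes, b->size);
    b->slot = next;
    b->bytes = toy_slots[next];
}

/* Append step: cache the pointer, possibly collect, and refresh the
 * cache afterwards only if refresh_cache is non-zero. */
void toy_append(ToyBin *b, int gc_runs, int refresh_cache)
{
    toy_current_bin = b->bytes;
    if (gc_runs)
        toy_gc_move(b);
    if (refresh_cache)
        toy_current_bin = b->bytes;
}

/* Put step: writes one byte through the cached pointer. */
int toy_put_byte(ToyBin *b, unsigned char v)
{
    assert(toy_current_bin != NULL);
    if (b->size >= TOY_CAP)
        return -1;
    toy_current_bin[b->size] = v;
    b->size++;
    return 0;
}
```
</pre>
              <div>Without the refresh, the byte lands in the old slot
                and never shows up in the data the binary now points to.
                In the real emulator the old location could already have
                been freed, so the write would corrupt allocator
                memory.</div>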
              <div><br>
              </div>
              <div>Fix:</div>
              <div><br>
              </div>
              <div>erl_bits.c:<br>
                Eterm<br>
                erts_bs_append(Process* c_p, Eterm* reg, Uint live,
                Eterm build_size_term,<br>
                       Uint extra_words, Uint unit)<br>
                …<br>
                   if (c_p->stop - c_p->htop < heap_need) {<br>
                       (void) erts_garbage_collect(c_p, heap_need, reg,
                live+1);<br>
                   }<br>
                   sb = (ErlSubBin *) c_p->htop;<br>
                   c_p->htop += ERL_SUB_BIN_SIZE;<br>
                   sb->thing_word = HEADER_SUB_BIN;<br>
                   sb->size = BYTE_OFFSET(used_size_in_bits);<br>
                   sb->bitsize = BIT_OFFSET(used_size_in_bits);<br>
                   sb->offs = 0;<br>
                   sb->bitoffs = 0;<br>
                   sb->is_writable = 1;<br>
                   sb->orig = reg[live];<br>
                <br>
///////////////////////////////////////////////////////////////////</div>
              <div>// add these lines</div>
              <div>
                <div>///////////////////////////////////////////////////////////////////</div>
              </div>
              <div>   pb = (ProcBin *) boxed_val(sb->orig);</div>
              <div>   erts_current_bin = pb->bytes;<br>
                   erts_writable_bin = 1;<br>
///////////////////////////////////////////////////////////////////<br>
                <br>
                   return make_binary(sb);<br>
                …<br>
              </div>
              <div><br>
              </div>
            </blockquote>
            Can you reproduce the bug and verify that this fix really
            works? The thing is that binaries should *only* be
            reallocated in the GC if there are no active writers, and
            there obviously is one here (pb->flags |= PB_ACTIVE_WRITER
            a few lines earlier). So if this code change actually
            removes the crash, the bug would be in the GC's detection of
            active writers.<br>
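            <div>The invariant described above can be sketched as a toy
              model in C. ToyProcBin, TOY_ACTIVE_WRITER and toy_gc_visit
              are hypothetical names chosen for illustration, not the
              emulator's actual structures or GC code.</div>
<pre>
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit playing the role of PB_ACTIVE_WRITER. */
#define TOY_ACTIVE_WRITER 0x1u

typedef struct {
    size_t used;      /* bytes holding data */
    size_t allocated; /* bytes reserved, including room for appends */
    unsigned flags;
} ToyProcBin;

/* Collector visit: trims unused capacity only when nobody is writing.
 * Returns 1 if the buffer was resized (and so could have moved),
 * 0 if it was left untouched. */
int toy_gc_visit(ToyProcBin *pb)
{
    assert(pb != NULL);
    if (pb->flags & TOY_ACTIVE_WRITER)
        return 0; /* a writer holds a raw pointer: do not touch */
    if (pb->allocated == pb->used)
        return 0; /* nothing to trim */
    pb->allocated = pb->used;
    return 1;
}
```
</pre>
            <div>If a buffer flagged this way were resized anyway, any
              raw pointer cached by the writer could end up stale, which
              is the symptom reported in this thread.</div>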
          </div>
        </blockquote>
        <div><br>
        </div>
        <div>Yes, it works in my case. I don't have a simple test case
          for reproducing this bug (I actually run a few processes that
          send requests to pgsql).</div>
        <div><br>
        </div>
        <div>
          <div>    pb = (ProcBin *) boxed_val(sb->orig);</div>
          <div>    if (erts_current_bin != (pb->bytes)) {</div>
          <div>        fprintf(stderr, "erts_current_bin !=
            (pb->bytes)\n");</div>
          <div>        fflush(stderr);</div>
          <div>    }</div>
          <div>    erts_current_bin = pb->bytes;</div>
          <div>    erts_writable_bin = 1;</div>
        </div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>
          <div>(jskit@siden)1> f(F), F = fun() ->
            postgresql:equery('echo-customers', write, <<"some
            query here">>, []) end.</div>
          <div>#Fun<erl_eval.20.82930912></div>
          <div>(jskit@siden)2> perftest:comprehensive(1000, F).</div>
          <div>Sequential 100 cycles in ~1 seconds (100 cycles/s)</div>
          <div>Sequential 200 cycles in ~2 seconds (106 cycles/s)</div>
          <div>Sequential 1000 cycles in ~12 seconds (85 cycles/s)</div>
          <div>Parallel 2 1000 cycles in ~8 seconds (132 cycles/s)</div>
          <div>Parallel 4 1000 cycles in ~8 seconds (121 cycles/s)</div>
          <div>Parallel 10 1000 cycles in ~8 seconds (119 cycles/s)</div>
          <div>Parallel 100 1000 cycles in ~13 seconds (74 cycles/s)</div>
          <div>[85,132,121,119,74]</div>
          <div>(jskit@siden)3> perftest:comprehensive(1000, F).</div>
          <div>Sequential 100 cycles in ~1 seconds (83 cycles/s)</div>
          <div>Sequential 200 cycles in ~2 seconds (83 cycles/s)</div>
          <div>Sequential 1000 cycles in ~14 seconds (71 cycles/s)</div>
          <div>Parallel 2 1000 cycles in ~11 seconds (95 cycles/s)</div>
          <div>Parallel 4 1000 cycles in ~10 seconds (105 cycles/s)</div>
          <div>Parallel 10 1000 cycles in ~11 seconds (91 cycles/s)</div>
          <div>Parallel 100 1000 cycles in ~13 seconds (76 cycles/s)</div>
          <div>"G_i[L"</div>
          <div>(jskit@siden)4> perftest:comprehensive(1000, F).</div>
          <div>Sequential 100 cycles in ~1 seconds (88 cycles/s)</div>
          <div>Sequential 200 cycles in ~2 seconds (85 cycles/s)</div>
          <div>Sequential 1000 cycles in ~13 seconds (74 cycles/s)</div>
          <div>Parallel 2 1000 cycles in ~9 seconds (109 cycles/s)</div>
          <div>Parallel 4 1000 cycles in ~10 seconds (101 cycles/s)</div>
          <div>Parallel 10 1000 cycles in ~11 seconds (95 cycles/s)</div>
          <div>erts_current_bin != (pb->bytes)</div>
          <div>Parallel 100 1000 cycles in ~13 seconds (77 cycles/s)</div>
          <div>"Jme_M"</div>
        </div>
        <br>
        <blockquote type="cite">
          <div bgcolor="#FFFFFF" text="#000000">  <br>
            <blockquote
              cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
              type="cite">
              <div><br>
              </div>
              <div>--</div>
              <div>Cheers,</div>
              <div>Denis</div>
            </blockquote>
            Cheers,<br>
            /Patrik<br>
            <blockquote
              cite="mid:79133563-669F-4FDD-8982-01DB7B321DA5@aboutecho.com"
              type="cite">
              <div><br>
                <div>
                  <div>On 20.11.2012, at 19:37, Musumeci, Antonio S
                    wrote:</div>
                  <br class="Apple-interchange-newline">
                  <blockquote type="cite">
                    <div text="#000000" bgcolor="#ffffff">
                      <div><br class="webkit-block-placeholder">
                      </div>
                      <div dir="ltr" align="left"><font color="#0000ff"
                          face="Arial" size="2">
                          <p align="left"><font color="#0000ff"
                              face="Arial" size="2"><font
                                color="#0000ff" face="Arial" size="2"><font
                                  color="#0000ff" face="Arial" size="2">I've
                                  got lots of cores... but they are all
                                  from optimized builds.</font></font></font></p>
                          <font color="#0000ff" face="Arial" size="2"><font
                              color="#0000ff" face="Arial" size="2"><font
                                color="#0000ff" face="Arial" size="2">
                                <p dir="ltr" align="left">Has this been
                                  seen in other versions? We are keen to
                                  solve this because it's causing us
                                  pain in production. We hit another,
                                  older, memory bug (the 32bit values
                                  used in 64bit build)... and now this.</p>
                              </font></font></font>
                          <p dir="ltr" align="left"><font
                              color="#0000ff" face="Arial" size="2"><font
                                color="#0000ff" face="Arial" size="2"><font
                                  color="#0000ff" face="Arial" size="2">I'm
                                  going to be building and trying R15B01
                                  to see if we hit it as well. I'll send
                                  any additional information I can.</font></font></font><font
                              color="#000000" face="Times New Roman"
                              size="3"> <span class="403263615-20112012"><font
                                  color="#0000ff" face="Arial" size="2">Any

                                  suggestions on debugging beam would be
                                  appreciated. Compile options, etc.</font></span></font></p>
                          <p dir="ltr" align="left">Thanks.</p>
                        </font>
                        <p dir="ltr" align="left"><font color="#0000ff"
                            face="Arial" size="2"><font color="#0000ff"
                              face="Arial" size="2"><font
                                color="#0000ff" face="Arial" size="2"><font
                                  color="#0000ff" face="Arial" size="2">-antonio</font></font></font></font><br>
                        </p>
                      </div>
                      <div class="OutlookMessageHeader" dir="ltr"
                        align="left" lang="en-us">
                        <hr tabindex="-1"><font face="Tahoma" size="2"><b>From:</b><span
                            class="Apple-converted-space"> </span><a
                            moz-do-not-send="true"
                            href="mailto:erlang-bugs-bounces@erlang.org">erlang-bugs-bounces@erlang.org</a><span
                            class="Apple-converted-space"> </span>[<a
                            moz-do-not-send="true"
                            class="moz-txt-link-freetext"
                            href="mailto:erlang-bugs-bounces@erlang.org">mailto:erlang-bugs-bounces@erlang.org</a>]<span
                            class="Apple-converted-space"> </span><b>On
                            Behalf Of<span class="Apple-converted-space"> </span></b>Patrik

                          Nyblom<br>
                          <b>Sent:</b><span
                            class="Apple-converted-space"> </span>Monday,

                          November 19, 2012 8:55 AM<br>
                          <b>To:</b><span class="Apple-converted-space"> </span><a
                            moz-do-not-send="true"
                            href="mailto:erlang-bugs@erlang.org">erlang-bugs@erlang.org</a><br>
                          <b>Subject:</b><span
                            class="Apple-converted-space"> </span>Re:
                          [erlang-bugs] beam core'ing<br>
                        </font><br>
                      </div>
                      <div class="moz-cite-prefix">On 11/19/2012 02:01
                        PM, Musumeci, Antonio S wrote:<br>
                      </div>
                      <blockquote
cite="mid:51C6F20DC46369418387C5250127649B039D96@HZWEX2014N4.msad.ms.com"
                        type="cite">
                        <div><br class="webkit-block-placeholder">
                        </div>
                        <div><span lang="EN">
                            <p><span class="483463912-19112012"><font
                                  face="Arial" size="2">I'm just
                                  starting to debug this but figured I'd
                                  send it along in case anyone has seen
                                  this before.</font></span></p>
                            <p><span class="483463912-19112012"><font
                                  face="Arial" size="2">64bit RHEL 5.0.1</font></span></p>
                            <p><span class="483463912-19112012"><font
                                  face="Arial" size="2">built from
                                  source beam.smp R15B02</font></span></p>
                            <p><span class="483463912-19112012"><font
                                  face="Arial" size="2">Happens
                                  consistently when trying to start our
                                  app and then just stops after a time.
                                  Across a few boxes. Oddly we have an
                                  identical cluster (hw and sw) and it
                                  never happens.</font></span></p>
                          </span></div>
                      </blockquote>
                      <font size="2"><font face="Arial">Yes! I've seen
                          it before and have tried for several months to
                          get a reproducible example and a core I can
                          analyze here. I've had one core that was
                          somewhat readable, but had no luck in locating
                          the beam code that triggered this. If you
                          could try narrowing it down, I would be really
                          grateful!<br>
                          <br>
                          Please email me any findings, theories, core
                          dumps - anything! I really want to find this!
                          The most interesting would be to find the
                          snippet of Erlang code that makes this happen
                          (probably intermittently).<br>
                          <br>
                          The problem is that when the allocators crash,
                          the error is usually somewhere else: access of
                          freed memory, a double free, or something else
                          doing horrid things to memory. Obviously none
                          of our test suites exercise this bug, as
                          neither our debug builds nor our valgrind runs
                          find it. It happens on both SMP and non-SMP
                          and is always in the context of
                          erts_bs_append</font></font>,
                      so I'm pretty sure this has a connection to the
                      other users seeing the crash in the
                      allocators...<br>
                      <br>
                      Cheers,<br>
                      Patrik<br>
                      <blockquote
cite="mid:51C6F20DC46369418387C5250127649B039D96@HZWEX2014N4.msad.ms.com"
                        type="cite">
                        <div><span lang="EN">
                            <p>#0 bf_unlink_free_block
                              (flags=<optimized out>,
                              block=0x6f00, allctr=<optimized
                              out>) at beam/erl_bestfit_alloc.c:789<br>
                              #1 bf_get_free_block (allctr=0x6824600,
                              size=304, cand_blk=0x0,
                              cand_size=<optimized out>, flags=0)
                              at beam/erl_bestfit_alloc.c:869<br>
                              #2 0x000000000045343c in mbc_alloc_block
                              (alcu_flgsp=<optimized out>,
                              blk_szp=<optimized out>,
                              size=<optimized out>,
                              allctr=<optimized out>) at
                              beam/erl_alloc_util.c:1198<br>
                              #3 mbc_alloc (allctr=0x6824600, size=295)
                              at beam/erl_alloc_util.c:1345<br>
                              #4 0x000000000045398d in
                              do_erts_alcu_alloc (type=164,
                              extra=0x6824600, size=295) at
                              beam/erl_alloc_util.c:3442<br>
                              #5 0x0000000000453a0f in
                              erts_alcu_alloc_thr_pref (type=164,
                              extra=<optimized out>, size=287) at
                              beam/erl_alloc_util.c:3520<br>
                              #6 0x0000000000511463 in erts_alloc
                              (size=287, type=<optimized out>) at
                              beam/erl_alloc.h:208<br>
                              #7 erts_bin_nrml_alloc (size=<optimized
                              out>) at beam/erl_binary.h:260<br>
                              #8 erts_bs_append (c_p=0x69fba60,
                              reg=<optimized out>,
                              live=<optimized out>,
                              build_size_term=<optimized out>,
                              extra_words=0, unit=8)<span
                                class="483463912-19112012"><span
                                  class="Apple-converted-space"> </span></span>at

                              beam/erl_bits.c:1327<br>
                              #9 0x000000000053ffd8 in process_main ()
                              at beam/beam_emu.c:3858<span
                                class="Apple-converted-space"> </span><br>
                              #10 0x00000000004ae853 in
                              sched_thread_func (vesdp=<optimized
                              out>) at beam/erl_process.c:5184<span
                                class="483463912-19112012"><span
                                  class="Apple-converted-space"> </span><br>
                              </span>#11 0x00000000005c17e9 in
                              thr_wrapper (vtwd=<optimized out>)
                              at pthread/ethread.c:106<span
                                class="483463912-19112012"><span
                                  class="Apple-converted-space"> </span><br>
                              </span>#12 0x00002b430f39e73d in
                              start_thread () from
                              /lib64/libpthread.so.0<span
                                class="483463912-19112012"><span
                                  class="Apple-converted-space"> </span><br>
                              </span>#13 0x00002b430f890f6d in clone ()
                              from /lib64/libc.so.6<span
                                class="483463912-19112012"><span
                                  class="Apple-converted-space"> </span><br>
                              </span>#14 0x0000000000000000 in ?? ()</p>
                          </span></div>
                        <br>
                        <br>
                        <hr id="HR1"><br>
                        <fieldset class="mimeAttachmentHeader"></fieldset>
                        <br>
                        <pre wrap="">_______________________________________________
erlang-bugs mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:erlang-bugs@erlang.org">erlang-bugs@erlang.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://erlang.org/mailman/listinfo/erlang-bugs">http://erlang.org/mailman/listinfo/erlang-bugs</a>
</pre>
                      </blockquote>
                    </div>
                  </blockquote>
                </div>
                <br>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>