<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi John,</p>
    <p>gc tracer(native) <b>1,000,000</b> steps:<br>
    </p>
    <pre style="background-color:#ffffff;color:#000000;font-family:'Menlo';font-size:18.0pt;">#{gc_major_end => 36,gc_major_start => 36,gc_max_heap_size => 0,gc_minor_end => 116,gc_minor_start => 116}

gc tracer(<b>NOT</b> native) <b>1,000</b> steps:</pre>
    <div class="moz-cite-prefix">
      <pre style="background-color:#ffffff;color:#000000;font-family:'Menlo';font-size:18.0pt;">#{gc_major_end => 35,gc_major_start => 35,gc_max_heap_size => 0,gc_minor_end => 1202,gc_minor_start => 1202}

Oliver
</pre>
    </div>
    <div class="moz-cite-prefix">On 27.03.19 17:22, John Högberg wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:5c1fa977cb2df1c976a622ea17c3105254c5de7f.camel@erlang.org">
      <div>Never mind, I'm blind: I only just noticed "UseProf". I may
        need more coffee :o)</div>
      <div><br>
      </div>
      <div>It's possible that the native code generates less garbage on
        the heap, causing fewer GCs, which will be a lot faster if your
        processes have a lot of live data as it won't have to copy it
        over and over. Try comparing how many garbage collections the
        process has gone through with <font size="3" face="monospace">process_info(Pid,
          garbage_collection)</font>, maybe it will provide some clue.</div>
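      <div><br>
      </div>
      <div>A minimal sketch of two ways to gather those numbers, not a
        definitive recipe. The module name gc_compare and both function
        names are made up for illustration. snapshot/1 reads the
        garbage_collection item from process_info; as far as I know its
        minor_gcs field only counts minor collections since the last
        major one, so count/1 instead tallies the gc_minor_start,
        gc_major_start, etc. messages from erlang:trace/3 while Fun runs
        in a fresh process.</div>
      <pre>-module(gc_compare).
-export([snapshot/1, count/1]).

%% One-off snapshot of the GC settings and counter of a live process.
%% Assumption: minor_gcs restarts from zero after each major collection.
snapshot(Pid) ->
    {garbage_collection, Info} = erlang:process_info(Pid, garbage_collection),
    #{minor_gcs => proplists:get_value(minor_gcs, Info),
      fullsweep_after => proplists:get_value(fullsweep_after, Info)}.

%% Run Fun in a fresh process with garbage_collection tracing enabled and
%% return a map from trace tag (gc_minor_start, gc_major_end, ...) to count.
%% The calling process acts as the tracer.
count(Fun) ->
    Self = self(),
    Pid = spawn(fun() ->
                    receive go -> Fun(), Self ! {done, self()} end
                end),
    1 = erlang:trace(Pid, true, [garbage_collection]),
    Pid ! go,
    receive {done, Pid} -> ok end,
    collect(Pid, #{}).

%% Drain the trace messages for Pid; stop after 100 ms of silence.
collect(Pid, Acc) ->
    receive
        {trace, Pid, Tag, _Info} ->
            collect(Pid, maps:update_with(Tag, fun(N) -> N + 1 end, 1, Acc))
    after 100 ->
        Acc
    end.</pre>
      <div>Calling gc_compare:count(Fun) once with the module compiled
        normally and once with native, using the same Fun, gives two
        maps that can be compared side by side.</div>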
      <div><br>
      </div>
      <div>/John</div>
      <div><br>
      </div>
      <div>On Wed, 2019-03-27 at 16:31 +0100, John Högberg wrote:</div>
      <blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px
        #729fcf solid;padding-left:1ex">
        <div>Hi Oliver,</div>
        <div><br>
        </div>
        <div>Have you tried comparing performance without eprof?</div>
        <div><br>
        </div>
        <div>eprof uses tracing to figure out which functions take a
          long time to run, which adds considerable overhead to small
          functions that are repeated extremely often. HiPE doesn't
          support tracing at all, so that overhead simply disappears
          when the module is native-compiled.</div>
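        <div><br>
        </div>
        <div>A rough sketch of timing a function without eprof, using
          only timer:tc/1, so that no call tracing is involved. The
          module name bench_sketch and the function time_us/2 are
          hypothetical and only illustrate the idea.</div>
        <pre>-module(bench_sketch).
-export([time_us/2]).

%% Call Fun() N times and return the total wall-clock time in microseconds.
time_us(Fun, N) when is_function(Fun, 0), is_integer(N), N >= 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Fun, N) end),
    Micros.

%% Plain tail-recursive loop; the result of each call is discarded.
repeat(_Fun, 0) -> ok;
repeat(Fun, N) -> _ = Fun(), repeat(Fun, N - 1).</pre>
        <div>Running the same call through time_us/2 with and without
          native compilation should show how much of the difference
          remains once the profiler is out of the picture.</div>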
        <div><br>
        </div>
        <div>Regards,</div>
        <div>John Högberg</div>
        <div><br>
        </div>
        <div>On Wed, 2019-03-27 at 16:18 +0100, Oliver Bollmann wrote:</div>
        <blockquote type="cite" style="margin:0 0 0 .8ex;
          border-left:2px #729fcf solid;padding-left:1ex">
          <p>Hi John,</p>
          <p>Indeed, standalone the factor is only about 3.7 :-(</p>
          <p>Attached the module i used. The code is part of: <a
              class="moz-txt-link-freetext"
              href="https://gitlab.com/Project-FiFo/DalmatinerDB/bitmap"
              moz-do-not-send="true">https://gitlab.com/Project-FiFo/DalmatinerDB/bitmap</a></p>
          <p>I wonder where the boost comes from?</p>
          <p>Facts: OS: OSX 10.14.3 (64GB)<br>
            Erlang 20.3.18<br>
            The "boost" module uses a lot of process dictionary
            (about 10GB, almost all of it binaries!)<br>
          </p>
          <p>Any hints?</p>
          <p>Oliver<br>
          </p>
          <div class="moz-cite-prefix">On 27.03.19 13:04, John Högberg
            wrote:<br>
          </div>
          <blockquote type="cite"
            cite="mid:279725e33f4b420fd95841dbe69378d5919fba50.camel@erlang.org"
            style="margin:0 0 0 .8ex; border-left:2px #729fcf
            solid;padding-left:1ex">
            <div>Hi Oliver,</div>
            <div><br>
            </div>
            <div>I've tried to reproduce this discrepancy on my machine,
              but I can only see a modest difference on the latest OTP 21
              (the results are in microseconds):</div>
            <div><br>
            </div>
            <pre>Erlang/OTP 21 [erts-10.3.1] [source] [64-bit] [smp:24:24] [ds:24:24:10] [async-threads:1] [hipe]</pre>
            <pre>Eshell V10.3.1  (abort with ^G)</pre>
            <pre>1> c(t, []).           </pre>
            <pre>{ok,t}</pre>
            <pre>2> t:bench(one).       </pre>
            <pre>15957304</pre>
            <pre>3> t:bench(union).</pre>
            <pre>559470</pre>
            <pre>4> c(t, [native]).     </pre>
            <pre>{ok,t}</pre>
            <pre>5> t:bench(one).  </pre>
            <pre>3611371</pre>
            <pre>6> t:bench(union).</pre>
            <pre>501871</pre>
            <div><br>
            </div>
            <div>I've attached the source code I used for this test. Am
              I missing something?</div>
            <div><br>
            </div>
            <div>Regards,</div>
            <div>John Högberg</div>
            <div><br>
            </div>
            <div>On Wed, 2019-03-27 at 10:09 +0100, Oliver Bollmann
              wrote:</div>
            <blockquote type="cite" style="margin:0 0 0 .8ex;
              border-left:2px #729fcf solid;padding-left:1ex">
              <p>I use the following functions on binaries like <<1:1000000>>:</p>
              <pre>one(<span style="color:#660e7a;">F</span>,<<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, <span style="color:#660e7a;">Bitmap</span>:<span style="color:#660e7a;">Size</span>/bitstring, <span style="color:#660e7a;">_</span>/bitstring>><span style="color:#660e7a;"></span>) -></pre>
              <pre>  one(<span style="color:#660e7a;">F</span>,<span style="color:#660e7a;">Bitmap</span>,<span style="color:#0000ff;">0</span>,[])<span style="color:#000080;font-weight:bold;">.</span></pre>
              <pre>one(<span style="color:#660e7a;">F</span>, <<<span style="color:#0000ff;">0</span>:<span style="color:#0000ff;">1</span>, <span style="color:#660e7a;">R</span>/bitstring>>, <span style="color:#660e7a;">N</span>, <span style="color:#660e7a;">Acc</span>) -></pre>
              <pre>  one(<span style="color:#660e7a;">F</span>, <span style="color:#660e7a;">R</span>, <span style="color:#660e7a;">N </span>+ <span style="color:#0000ff;">1</span>, <span style="color:#660e7a;">Acc</span>);</pre>
              <pre>one(<span style="color:#660e7a;">F</span>, <<<span style="color:#0000ff;">1</span>:<span style="color:#0000ff;">1</span>, <span style="color:#660e7a;">R</span>/bitstring>>, <span style="color:#660e7a;">N</span>, <span style="color:#660e7a;">Acc</span>) -></pre>
              <pre>  one(<span style="color:#660e7a;">F</span>, <span style="color:#660e7a;">R</span>, <span style="color:#660e7a;">N </span>+ <span style="color:#0000ff;">1</span>, [<span style="color:#660e7a;">F</span>(<span style="color:#660e7a;">N</span>) | <span style="color:#660e7a;">Acc</span>]);</pre>
              <pre>one(<span style="color:#660e7a;">_</span>, <<>>, <span style="color:#660e7a;">_</span>, <span style="color:#660e7a;">Acc</span>) -> <span style="color:#660e7a;">Acc</span><span style="color:#000080;font-weight:bold;">.</span></pre>
              <pre><span style="color:#000080;font-weight:bold;"></span></pre>
              <pre>union(<<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, <span style="color:#660e7a;">L</span>:<span style="color:#660e7a;">Size</span>/unsigned, <span style="color:#660e7a;">P</span>/bitstring>>,</pre>
              <pre>    <<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, <span style="color:#660e7a;">R</span>:<span style="color:#660e7a;">Size</span>/unsigned, <span style="color:#660e7a;">_</span>/bitstring>>) -></pre>
              <pre>  <<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, (<span style="color:#660e7a;">L </span><span style="color:#000080;font-weight:bold;">bor </span><span style="color:#660e7a;">R</span>):<span style="color:#660e7a;">Size</span>/unsigned, <span style="color:#660e7a;">P</span>/bitstring>><span style="color:#000080;font-weight:bold;">.</span></pre>
              <p>and call these functions 1,000,000 times. Without native
                compilation, 1,000 calls take about 20 minutes. <br>
              </p>
              <p>If I compile natively with -compile([native,{hipe, o2}]),
                1,000 calls take 3 seconds, so it is about 400x
                faster!</p>
              <p>OS: OSX<br>
              </p>
              <p>What is the secret?<span
                  style="color:#000080;font-weight:bold;"></span></p>
              <pre>-- </pre>
              <pre>Regards</pre>
              <pre>Oliver Bollmann</pre>
              <pre>_______________________________________________</pre>
              <pre>erlang-questions mailing list</pre>
              <pre><a href="mailto:erlang-questions@erlang.org" moz-do-not-send="true">erlang-questions@erlang.org</a></pre>
              <pre><a href="http://erlang.org/mailman/listinfo/erlang-questions" moz-do-not-send="true">http://erlang.org/mailman/listinfo/erlang-questions</a></pre>
            </blockquote>
          </blockquote>
          <pre>-- </pre>
          <pre>Regards</pre>
          <pre>Oliver Bollmann</pre>
        </blockquote>
      </blockquote>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
Regards
Oliver Bollmann</pre>
  </body>
</html>