<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi John,</p>
<p><b>Problem solved!</b></p>
<p>The secret is: process_flag(min_heap_size,<span
style="color:#0000ff;">1024</span>*<span style="color:#0000ff;">1024</span>*<span
style="color:#0000ff;">10</span>),process_flag(<b>min_bin_vheap_size</b>,<span
style="color:#0000ff;">1024</span>*<span style="color:#0000ff;">1024</span>*<span
style="color:#0000ff;">10</span>*<span style="color:#0000ff;">10</span>),</p>
<p>With this I get, <b>without native</b>, for 1,000,000 steps:</p>
<pre style="background-color:#ffffff;color:#000000;font-family:'Menlo';font-size:18.0pt;">#{gc_major_end => 8,gc_major_start => 8,gc_max_heap_size => 0,gc_minor_end => 85,gc_minor_start => 85}
Performance is 100 times faster; the missing factor of 4 comes from HiPE itself!
Very nice!
Oliver
</pre>
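<p>For reference, a minimal sketch of applying the same two settings at spawn time rather than from inside the running process. The module and function names are hypothetical; spawn_opt/2 accepts the same min_heap_size and min_bin_vheap_size options as process_flag/2, and both sizes are measured in words, not bytes:</p>
<pre>-module(heap_flags_sketch).
-export([run_with_big_heaps/1]).

%% Hypothetical helper: runs Fun in a new process that starts with the
%% heap sizes from this thread, so early heap growth needs fewer GCs.
run_with_big_heaps(Fun) when is_function(Fun, 0) ->
    Parent = self(),
    Opts = [monitor,
            {min_heap_size, 1024*1024*10},           % words, not bytes
            {min_bin_vheap_size, 1024*1024*10*10}],  % words, not bytes
    {Pid, MRef} = spawn_opt(fun() -> Parent ! {self(), Fun()} end, Opts),
    receive
        {Pid, Result} ->
            %% Drop the monitor and any 'DOWN' message already queued.
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            %% Fun crashed before it could send a result.
            {error, Reason}
    end.
</pre>
<p>Calling heap_flags_sketch:run_with_big_heaps(fun() -> 1 + 2 end) returns {ok,3}. On a 64-bit VM a min_heap_size of 10 M words reserves roughly 80 MB per process up front, so this only makes sense for a few long-lived, data-heavy processes.</p>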
<div class="moz-cite-prefix">On 28.03.19 06:50, Oliver Bollmann
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:80b6aceb-44d1-3c5b-3825-fa9e35d6f847@t-online.de">
<p>Hi John,</p>
<p>gc tracer(native) <b>1,000,000</b> steps:<br>
</p>
<pre style="background-color:#ffffff;color:#000000;font-family:'Menlo';font-size:18.0pt;">#{gc_major_end => 36,gc_major_start => 36,gc_max_heap_size => 0,gc_minor_end => 116,gc_minor_start => 116}
gc tracer(<b>NOT</b> native) <b>1,000</b> steps:</pre>
<div class="moz-cite-prefix">
<pre style="background-color:#ffffff;color:#000000;font-family:'Menlo';font-size:18.0pt;">#{gc_major_end => 35,gc_major_start => 35,gc_max_heap_size => 0,gc_minor_end => 1202,gc_minor_start => 1202}
Oliver
</pre>
</div>
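<p>Counters of this shape can be collected with the garbage_collection trace flag. The following is only a sketch under that assumption, not the tracer actually used for the numbers above; the module and function names are hypothetical:</p>
<pre>-module(gc_trace_sketch).
-export([count/1]).

%% Hypothetical helper: runs Fun in a traced process and returns a map
%% with the number of GC trace events of each kind that were received.
count(Fun) when is_function(Fun, 0) ->
    %% The new process waits for 'go' so tracing is enabled before Fun runs.
    {Pid, MRef} = spawn_monitor(fun() -> receive go -> Fun() end end),
    1 = erlang:trace(Pid, true, [garbage_collection]),
    Pid ! go,
    receive {'DOWN', MRef, process, Pid, _Reason} -> ok end,
    %% Wait until every trace message from Pid has been delivered.
    Ref = erlang:trace_delivered(Pid),
    receive {trace_delivered, Pid, Ref} -> ok end,
    collect(Pid, #{gc_major_end => 0, gc_major_start => 0,
                   gc_max_heap_size => 0,
                   gc_minor_end => 0, gc_minor_start => 0}).

%% Drains the trace messages for Pid from the mailbox, counting by tag.
collect(Pid, Acc) ->
    receive
        {trace, Pid, Tag, _Info} ->
            collect(Pid, maps:update_with(Tag, fun(N) -> N + 1 end, 1, Acc))
    after 0 ->
        Acc
    end.
</pre>
<p>For example, gc_trace_sketch:count(fun() -> lists:seq(1, 100000) end) returns a map with the same keys as the output above; the actual counts depend on the VM version and heap settings.</p>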
<div class="moz-cite-prefix">On 27.03.19 17:22, John Högberg
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:5c1fa977cb2df1c976a622ea17c3105254c5de7f.camel@erlang.org">
<div>Never mind, I'm blind; I just noticed "UseProf" now. I may
need more coffee :o)</div>
<div><br>
</div>
<div>It's possible that the native code generates less garbage
on the heap, causing fewer GCs, which will be a lot faster if
your processes have a lot of live data as it won't have to
copy it over and over. Try comparing how many garbage
collections the process has gone through with <font size="3"
face="monospace">process_info(Pid, garbage_collection)</font>,
maybe it will provide some clue.</div>
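<div>A minimal sketch of that check, with hypothetical module and function names. It only wraps process_info/2; as far as I know, the minor_gcs entry counts minor collections since the last major one rather than a lifetime total, so treat it as a rough indicator:</div>
<pre>-module(gc_info_sketch).
-export([gc_summary/1]).

%% Hypothetical helper: returns the garbage_collection info of Pid as a
%% map, or {error, noproc} if the process is no longer alive.
gc_summary(Pid) when is_pid(Pid) ->
    case process_info(Pid, garbage_collection) of
        {garbage_collection, Info} ->
            %% Info is a list of {Key, Value} pairs such as
            %% min_heap_size, min_bin_vheap_size, fullsweep_after, minor_gcs.
            {ok, maps:from_list(Info)};
        undefined ->
            {error, noproc}
    end.
</pre>
<div>Calling gc_info_sketch:gc_summary(Pid) for the same workload with and without native compilation gives two maps that can be compared directly.</div>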
<div><br>
</div>
<div>/John</div>
<div><br>
</div>
<div>On Wed, 2019-03-27 at 16:31 +0100, John Högberg wrote:</div>
<blockquote type="cite" style="margin:0 0 0 .8ex;
border-left:2px #729fcf solid;padding-left:1ex">
<div>Hi Oliver,</div>
<div><br>
</div>
<div>Have you tried comparing performance without eprof?</div>
<div><br>
</div>
<div>eprof uses tracing to figure out which functions take a
long time to run, which adds considerable overhead to small
functions that are repeated extremely often. HiPE doesn't
support tracing at all, so that overhead simply disappears
when the module is native-compiled.</div>
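<div>A minimal sketch of timing the calls without eprof, assuming a plain timer:tc/1 wall-clock measurement is enough for the comparison. The module and function names are hypothetical:</div>
<pre>-module(bench_sketch).
-export([time_us/2]).

%% Hypothetical helper: calls Fun N times with no tracing enabled and
%% returns the elapsed wall-clock time in microseconds.
time_us(Fun, N) when is_function(Fun, 0), is_integer(N), N >= 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Fun, N) end),
    Micros.

%% Runs Fun N times, discarding its results.
repeat(_Fun, 0) -> ok;
repeat(Fun, N) -> _ = Fun(), repeat(Fun, N - 1).
</pre>
<div>Running the same bench_sketch:time_us(Fun, 1000) call once with the module compiled normally and once with [native] shows the difference without any tracing overhead mixed in.</div>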
<div><br>
</div>
<div>Regards,</div>
<div>John Högberg</div>
<div><br>
</div>
<div>On Wed, 2019-03-27 at 16:18 +0100, Oliver Bollmann wrote:</div>
<blockquote type="cite" style="margin:0 0 0 .8ex;
border-left:2px #729fcf solid;padding-left:1ex">
<p>Hi John,</p>
<p>Indeed, standalone the factor is only about 3.7 :-(</p>
<p>Attached is the module I used. The code is part of: <a
class="moz-txt-link-freetext"
href="https://gitlab.com/Project-FiFo/DalmatinerDB/bitmap"
moz-do-not-send="true">https://gitlab.com/Project-FiFo/DalmatinerDB/bitmap</a></p>
<p>I wonder where the boost comes from?</p>
<p>Facts: OS OSX 10.14.3 (64GB)<br>
Erlang 20.3.18, <br>
the "boost" module uses a lot of process
dictionary (about 10GB, almost all of it binaries!)<br>
</p>
<p>Any hints?</p>
<p>Oliver<br>
</p>
<div class="moz-cite-prefix">On 27.03.19 13:04, John Högberg
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:279725e33f4b420fd95841dbe69378d5919fba50.camel@erlang.org"
style="margin:0 0 0 .8ex; border-left:2px #729fcf
solid;padding-left:1ex">
<div>Hi Oliver,</div>
<div><br>
</div>
<div>I've tried to reproduce this discrepancy on my
machine, but I can only see a modest difference on the
latest OTP 21 (the results are in microseconds):</div>
<div><br>
</div>
<pre>Erlang/OTP 21 [erts-10.3.1] [source] [64-bit] [smp:24:24] [ds:24:24:10] [async-threads:1] [hipe]</pre>
<pre>Eshell V10.3.1 (abort with ^G)</pre>
<pre>1> c(t, []). </pre>
<pre>{ok,t}</pre>
<pre>2> t:bench(one). </pre>
<pre>15957304</pre>
<pre>3> t:bench(union).</pre>
<pre>559470</pre>
<pre>4> c(t, [native]). </pre>
<pre>{ok,t}</pre>
<pre>5> t:bench(one). </pre>
<pre>3611371</pre>
<pre>6> t:bench(union).</pre>
<pre>501871</pre>
<div><br>
</div>
<div>I've attached the source code I used for this test,
am I missing something?</div>
<div><br>
</div>
<div>Regards,</div>
<div>John Högberg</div>
<div><br>
</div>
<div>On Wed, 2019-03-27 at 10:09 +0100, Oliver Bollmann
wrote:</div>
<blockquote type="cite" style="margin:0 0 0 .8ex;
border-left:2px #729fcf solid;padding-left:1ex">
<p>I use, with binaries like <<1:1000000>>,</p>
<pre>one(<span style="color:#660e7a;">F</span>,<<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, <span style="color:#660e7a;">Bitmap</span>:<span style="color:#660e7a;">Size</span>/bitstring, <span style="color:#660e7a;">_</span>/bitstring>><span style="color:#660e7a;"></span>) -></pre>
<pre> one(<span style="color:#660e7a;">F</span>,<span style="color:#660e7a;">Bitmap</span>,<span style="color:#0000ff;">0</span>,[])<span style="color:#000080;font-weight:bold;">.</span></pre>
<pre>one(<span style="color:#660e7a;">F</span>, <<<span style="color:#0000ff;">0</span>:<span style="color:#0000ff;">1</span>, <span style="color:#660e7a;">R</span>/bitstring>>, <span style="color:#660e7a;">N</span>, <span style="color:#660e7a;">Acc</span>) -></pre>
<pre> one(<span style="color:#660e7a;">F</span>, <span style="color:#660e7a;">R</span>, <span style="color:#660e7a;">N </span>+ <span style="color:#0000ff;">1</span>, <span style="color:#660e7a;">Acc</span>);</pre>
<pre>one(<span style="color:#660e7a;">F</span>, <<<span style="color:#0000ff;">1</span>:<span style="color:#0000ff;">1</span>, <span style="color:#660e7a;">R</span>/bitstring>>, <span style="color:#660e7a;">N</span>, <span style="color:#660e7a;">Acc</span>) -></pre>
<pre> one(<span style="color:#660e7a;">F</span>, <span style="color:#660e7a;">R</span>, <span style="color:#660e7a;">N </span>+ <span style="color:#0000ff;">1</span>, [<span style="color:#660e7a;">F</span>(<span style="color:#660e7a;">N</span>) | <span style="color:#660e7a;">Acc</span>]);</pre>
<pre>one(<span style="color:#660e7a;">_</span>, <<>>, <span style="color:#660e7a;">_</span>, <span style="color:#660e7a;">Acc</span>) -> <span style="color:#660e7a;">Acc</span><span style="color:#000080;font-weight:bold;">.</span></pre>
<pre><span style="color:#000080;font-weight:bold;"></span></pre>
<pre>union(<<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, <span style="color:#660e7a;">L</span>:<span style="color:#660e7a;">Size</span>/unsigned, <span style="color:#660e7a;">P</span>/bitstring>>,</pre>
<pre> <<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, <span style="color:#660e7a;">R</span>:<span style="color:#660e7a;">Size</span>/unsigned, <span style="color:#660e7a;">_</span>/bitstring>>) -></pre>
<pre> <<<span style="color:#660e7a;">Size</span>:<span style="color:#0000ff;">64</span>/unsigned, (<span style="color:#660e7a;">L </span><span style="color:#000080;font-weight:bold;">bor </span><span style="color:#660e7a;">R</span>):<span style="color:#660e7a;">Size</span>/unsigned, <span style="color:#660e7a;">P</span>/bitstring>><span style="color:#000080;font-weight:bold;">.</span></pre>
<p>and call these functions 1,000,000 times; 1,000 calls
take about 20 minutes.<br>
</p>
<p>If I compile natively with -compile([native,{hipe,
o2}]), 1,000 calls take 3 seconds, so it is about 400x
faster!</p>
<p>OS: OSX<br>
</p>
<p>What is the secret?</p>
<pre>-- </pre>
<pre>Regards</pre>
<pre>Oliver Bollmann</pre>
<pre>_______________________________________________</pre>
<pre>erlang-questions mailing list</pre>
<pre><a href="mailto:erlang-questions@erlang.org" moz-do-not-send="true">erlang-questions@erlang.org</a></pre>
<pre><a href="http://erlang.org/mailman/listinfo/erlang-questions" moz-do-not-send="true">http://erlang.org/mailman/listinfo/erlang-questions</a></pre>
</blockquote>
<br>
</blockquote>
<pre>-- </pre>
<pre>Regards</pre>
<pre>Oliver Bollmann</pre>
</blockquote>
</blockquote>
</blockquote>
<pre class="moz-signature" cols="72">--
Regards
Oliver Bollmann</pre>
<br>
</blockquote>
<pre class="moz-signature" cols="72">--
Regards
Oliver Bollmann</pre>
</body>
</html>