<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <pre class="moz-quote-pre" wrap="">Hi,

> I'm glad it worked out!

> However, you're still going to copy those ~2GB of live data when a full

> GC finally happens, and I think you should consider reducing that

> figure. Do you really need all that data in one process?

The problem I solved with this process is resolving nested groups, using digraph:in_neighbours.

I have 1M groups, each with at least 100,000 members. The nesting depth is at least 100. Loops are allowed!

Question: which group has which members (=/= the group itself)?
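
For illustration, a minimal sketch of resolving transitive membership with the standard digraph and digraph_utils modules. The module name nested_groups, the function names, and the choice to point edges from a member to the group containing it are assumptions made for this example, not the actual code used here.

```erlang
-module(nested_groups).
-export([members/2, demo/0]).

%% All direct and indirect members of Group, excluding Group itself.
%% Assumption: edges point from a member to the group that contains it.
%% digraph_utils:reaching/2 also returns the start vertex, so it is removed.
members(G, Group) ->
    digraph_utils:reaching([Group], G) -- [Group].

demo() ->
    G = digraph:new(),  % digraphs are cyclic by default, so loops are fine
    lists:foreach(fun(V) -> digraph:add_vertex(G, V) end, [a, b, c, d]),
    %% b is in a, c is in b, a is in c (a loop), d is in c
    lists:foreach(fun({Member, Group}) ->
                          digraph:add_edge(G, Member, Group)
                  end,
                  [{b, a}, {c, b}, {a, c}, {d, c}]),
    Result = lists:sort(members(G, a)),
    digraph:delete(G),  % digraphs are ETS-backed and must be deleted
    Result.
```

digraph:in_neighbours/2 returns only the direct members of a group, while the sketch follows the nesting to any depth and terminates on loops.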

I started with ets, but ets copies the value on each access, so I got a lot of memory peaks. Not good.

I tried lists, maps and so on.

I ended up with the process dictionary. It is perfect: if the process dies the memory is gone, and because I use only binaries, no data is copied on get.

Now I use a bitmap, a 1Mx1M grid in which each bit is a nested group, and use union and intersection to resolve the nested groups.
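
To make the union/intersection idea concrete, here is a small sketch that stores one row of the grid as an arbitrary-precision integer, where bit I stands for group I. This is an assumption chosen for brevity: the real grid uses binaries, and the module name bitset and its functions are hypothetical.

```erlang
-module(bitset).
-export([from_list/1, union/2, intersection/2, member/2]).

%% Build a bit set from a list of non-negative group indices.
from_list(Indices) ->
    lists:foldl(fun(I, Acc) -> Acc bor (1 bsl I) end, 0, Indices).

%% Groups present in either row.
union(A, B) -> A bor B.

%% Groups present in both rows.
intersection(A, B) -> A band B.

%% Is group I set in this row?
member(I, Set) -> (Set band (1 bsl I)) =/= 0.
```

Each operation builds a new integer, so for very wide rows this trades memory churn for simplicity.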

The process now runs for about 10 minutes, saves the result in mnesia (about 5GB) and dies.

BTW, persistent_term looks good, because the grid is a one-time grid, as a way to split the work across more than one process. But for the next step I need 10M terms with about 1TB of binaries :-)
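
A minimal sketch of how a write-once grid row could be shared through persistent_term (available from OTP 21.2). The module name grid_store and the key layout are assumptions for illustration only.

```erlang
-module(grid_store).
-export([publish/2, row/1]).

%% Store a row once; readers in any process can then fetch it
%% without the term being copied onto their heap.
publish(Key, Row) ->
    persistent_term:put({?MODULE, Key}, Row).

%% Fetch a previously published row; raises badarg if the key is missing.
row(Key) ->
    persistent_term:get({?MODULE, Key}).
```

This fits data that is written once and read many times, because replacing or erasing a persistent term triggers a global garbage collection.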

Oliver


</pre>

    <div class="moz-cite-prefix">On 28.03.19 15:05, John Högberg wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:a1147aef1cf32f35abf67961544783e08c8dc0d3.camel@erlang.org">

      <pre class="moz-quote-pre" wrap="">On Wed, 2019-03-27 at 21:30 +0100, Kostis Sagonas wrote:

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">On the other hand, I would not call the performance difference

between BEAM and HiPE that you observed "modest".  Four times faster

execution is IMO something that deserves a better adjective.

Kostis

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

Yes, it's a very impressive improvement. "Modest" was in relation to

that 400x number and I should've been clearer about that, "reasonable

difference" would have been better wording.

On Thu, 2019-03-28 at 08:34 +0100, Oliver Bollmann wrote:

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">Hi John,

problem solved!

The secret is:

process_flag(min_heap_size,1024*1024*10),process_flag(min_bin_vheap_size,1024*1024*10*10),

With this, and without native compilation, I get for 1,000,000 steps:

#{gc_major_end => 8,gc_major_start => 8,gc_max_heap_size => 0,gc_minor_end => 85,gc_minor_start => 85}

Performance is 100 times faster; the missing factor of 4 comes from HiPE itself!

Very nice!

Oliver

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

I'm glad it worked out!

However, you're still going to copy those ~2GB of live data when a full

GC finally happens, and I think you should consider reducing that

figure. Do you really need all that data in one process?

On Thu, 2019-03-28 at 08:55 +0100, Frank Muller wrote:

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">Can someone shed some light on the difference between min_heap_size

and min_bin_vheap_size,

and on how to tweak them per process to tune the VM's performance?

Thanks

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

On the process heap, off-heap binaries are essentially just a small

chunk with a pointer and size, so if we decided to GC based on the

process heap alone we would keep an unreachable 1GB binary alive for

just as long as a 1KB one (all else equal), which is a bit suboptimal.

We therefore track the combined size of all our off-heap data and GC

when it exceeds the "virtual binary heap size," even if the process

heap is nowhere near full. This "virtual binary heap" grows and shrinks

much like the ordinary process heap, and the min_bin_vheap_size option

is analogous to min_heap_size.

In general you shouldn't need to play around with these settings, but

if you have a process that you know will grow really fast then there

may be something to gain by bumping its minimum heap size. I don't

recommend doing this without careful consideration though.

<a class="moz-txt-link-freetext" href="http://erlang.org/doc/efficiency_guide/processes.html#initial-heap-size">http://erlang.org/doc/efficiency_guide/processes.html#initial-heap-size</a>
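
As a rough sketch of the tuning described above, both minimums can be set when the process is spawned and read back with process_info/2. The module name heap_opts and its function names are hypothetical. Both sizes are given in words, not bytes, and the VM may round them up to the next size it supports.

```erlang
-module(heap_opts).
-export([start/3, gc_settings/1]).

%% Spawn Fun with a large initial heap and virtual binary heap so
%% that early garbage collections are avoided. Sizes are in words.
start(Fun, HeapWords, BinVheapWords) ->
    spawn_opt(Fun, [{min_heap_size, HeapWords},
                    {min_bin_vheap_size, BinVheapWords}]).

%% Read back the effective settings of a live process as
%% {MinHeapSize, MinBinVheapSize}.
gc_settings(Pid) ->
    {garbage_collection, Info} = process_info(Pid, garbage_collection),
    {proplists:get_value(min_heap_size, Info),
     proplists:get_value(min_bin_vheap_size, Info)}.
```

Setting the options at spawn time keeps the tuning local to the one process that needs it, instead of changing the defaults for the whole VM.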

/John

_______________________________________________

erlang-questions mailing list

<a class="moz-txt-link-abbreviated" href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a>

<a class="moz-txt-link-freetext" href="http://erlang.org/mailman/listinfo/erlang-questions">http://erlang.org/mailman/listinfo/erlang-questions</a>

</pre>

    </blockquote>

    <pre class="moz-signature" cols="72">-- 

Regards

Oliver Bollmann</pre>

  </body>

</html>