<div dir="ltr">Hello,<br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 23, 2016 at 1:47 PM, Max Lapshin <span dir="ltr"><<a href="mailto:max.lapshin@gmail.com" target="_blank">max.lapshin@gmail.com</a>></span> wrote: </div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span style="font-size:12.8px">We are running our erlang software flussonic, it captures around 1,5 gbit/s of input via TCP, allocates lot of binaries from 500 bytes to 1500 bytes and then prepares large binaries around 1 megabyte (video blobs). There are produced around 250 such blobs per second.</span> <br></div></blockquote><div> </div><div>Have you verified that these assumptions are correct via <a href="http://ferd.github.io/recon/recon_alloc.html#average_block_sizes-1">http://ferd.github.io/recon/recon_alloc.html#average_block_sizes-1</a>? Make sure to take multiple snapshots of current, as max is not really all that useful for this measurement.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><span class="gmail-m_7968002105309063585gmail-s1">+stbt db +sbwt short +swt very_low +sfwi 20</span> +zebwt short +sub true +MBas aoffcaobf +MBacul 0</div></div></blockquote><div><br></div><div>The carrier oriented allocator strategies (the ones with the longest names, i.e. CARRIERSTRATcBLOCKSTRAT) were specifically introduced to enable carrier migration. So using one of those together with disabling acul makes little sense. You most likely want to run +MBas aobf if you disable carrier migration. 
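<br><br>As a minimal sketch, assuming everything else on your command line stays as it is, the binary allocator part would then look something like the following. Only the two +MB switches come from the discussion above, your other flags are omitted, and I have not measured this against your workload:<pre>erl +MBas aobf +MBacul 0</pre>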
</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><p class="gmail-m_7968002105309063585gmail-p1">I get around 2000 mmaps and munmaps per second and recon_alloc tells that I have 98% of usage.  It looks rather strange, so I tried to play with tunings and switched to:</p></div></div></blockquote><div>There is an mseg cache that can be used to cache mmapped segments. By default it is set to something like 10 segments, which seems to be too low for your use case. You can increase the number of segments cached through the +MMmcs switch. The max value is 30, but I know that some other users have tried much higher numbers by changing the code in erts, and that has worked better for them. <br></div><div><br></div><div>You may want to take a look at the cache hit rates that you get from <a href="http://ferd.github.io/recon/recon_alloc.html#cache_hit_rates-0">http://ferd.github.io/recon/recon_alloc.html#cache_hit_rates-0</a>, to see if your changes have any effect.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><p class="gmail-m_7968002105309063585gmail-p1"><span class="gmail-m_7968002105309063585gmail-s1">+stbt db +sbwt short +swt very_low +sfwi 20</span> +zebwt short +sub true +MBas aoffcaobf +MBsbct 4096 +MBacul de +Mulmbcs 131071 +Mumbcgs 1 +Musmbcs 4095</p></div></blockquote><div> If it is specifically binaries that you are looking at, I would just change the config for +MB and not +Mu. 
Also, having a smaller smbcs than sbct seems a bit odd; why not just raise smbcs to the same value as lmbcs?<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><p class="gmail-m_7968002105309063585gmail-p1"></p><p class="gmail-m_7968002105309063585gmail-p1">I'm not quite sure that my settings are sane, but I tried to make very large multiblock areas and try to store my binaries inside large areas (not single block carrier, but multiblock carrier).</p><p class="gmail-m_7968002105309063585gmail-p1">With these settings I get about 50 mmap/munmap per second. Seems that hugepages are not used (frankly speaking I thought to autoenable them).</p></div></div></blockquote><div>If you align the mbcs with the size of transparent huge pages, that could be beneficial. On my system they are set to 2 MB; is the 128 MB that you are aiming for what they are set to on your system?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><p class="gmail-m_7968002105309063585gmail-p1">But with these settings I get about 50% of usage and all servers are quickly getting killer by OOM killer.</p></div></div></blockquote><div>This is quite odd; it almost feels like the carrier pool is misbehaving. Have you checked whether a large number of the carriers are in the carrier pool when this happens? Maybe try to lower the utilization threshold at which carriers are put in the pool, i.e. something like "+MBacul 10".</div><div><br></div><div>I assume that you are running a reasonably recent version of Erlang/OTP? 
I remember that we did some bug fixes a while back with regard to the pool.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><p class="gmail-m_7968002105309063585gmail-p1">With these flags I tried to hint allocator to create 128MB large areas and objects smaller than 4 megabytes to put into there areas.</p></div></div></blockquote><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><p class="gmail-m_7968002105309063585gmail-p1"></p><p class="gmail-m_7968002105309063585gmail-p1">So my questions are:<br></p><p class="gmail-m_7968002105309063585gmail-p1">1) should I worry about 2000 of mmap/unmap syscalls per second?<br></p></div></blockquote><div>Depends on how many schedulers you have running. I don't have any figures about how many mmaps/scheduler per second is good, but I would say that the fewer syscalls you make, the better.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><p class="gmail-m_7968002105309063585gmail-p1"></p><p class="gmail-m_7968002105309063585gmail-p1">2) should I try to reduce usage of sbct and increase usage of mbct?</p></div></div></blockquote><div>It's a bit of a tradeoff. Having too large items in the mbc allocations makes it harder for them to find spots to place blocks, while on the other hand the mbc allocators scale better than the sbc allocators.</div><div><br></div><div>So by placing too large blocks in the mbc, you get fragmentation issues. 
But if you place too many blocks in the sbcs, you get scalability issues instead :)</div><div><br></div><div>In general you want to have the majority of your allocations go to mbcs; what the ratio should be is hard to tell.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><p class="gmail-m_7968002105309063585gmail-p1">3) are my flags to erlang VM compatible with each other?</p></div></div></blockquote><div>They seem to be. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><p class="gmail-m_7968002105309063585gmail-p1">4) maybe some other hints?</p></div></div></blockquote><div>Measure and try to really understand what erlang:system_info({allocator,binary_alloc}) is giving you. recon_alloc is a great tool, but it is built with an interface to find the specific problems that we have encountered and it hides information from you. Most of the time I end up writing small scripts that analyze the data in a new way, looking for exactly what I want to see over time.</div><div><br></div><div>Also, reading the erts_alloc documentation very carefully is well worth doing.</div><div><br></div><div>There is also the possibility to completely disable erts alloc and fall back to malloc; you do that via "+Mea min". Doing that, you lose a bunch of nice statistics and scalability features. However, more man-hours have been spent optimizing malloc implementations, so they are a little bit faster per allocation.</div><div><br></div><div>Lukas</div></div></div></div>