<div dir="ltr"><div>Hi, Roberto.</div><div><br></div><div>It looks like you are seeing some sort of multiblock allocator fragmentation.</div><div>As you can see in the `recon_alloc:fragmentation(current).` output, your mbcs_usage of ~0.7 correlates with the result of `recon_alloc:memory(usage, current).`</div><div><br></div><div>So the problem isn't in refcounted binaries, but in multiblock allocator fragmentation.</div><div>Also note that the worst-utilized allocators are the ones for Erlang process heaps (eheap_alloc); the allocators for refc binaries (binary_alloc) come second.</div><div><br></div><div>Erlang's MBCS (multiblock carrier) allocators allocate large chunks of memory (carriers) and then hand out pieces of them (so-called blocks) on demand:</div><div><span style="font-size:13px"><font face="monospace, monospace">[ busy1 | busy2 | busy3 | free ....... ]</font></span><br></div><div><span style="font-family:'courier new',monospace;font-size:13px"><br></span></div><div><span style="font-size:13px"><font face="arial, helvetica, sans-serif">When some Erlang subsystem releases memory it was using (e.g. a process dies or a binary is released), that space is marked as free on the carrier and a pointer to it is placed in a binary tree so it can be reused:</font></span></div><div><span style="font-family:'courier new',monospace;font-size:13px"><br></span></div><div><font face="monospace, monospace"><span style="font-size:13px">[ busy1 | busy2 | busy3 | free1 | busy4 | free2 ]</span><br></font></div><div><span style="font-size:13px"><font face="monospace, monospace">         </font></span></div><div><span style="font-size:13px"><font face="monospace, monospace">   *--------------------^               ^</font></span></div><div><span style="font-size:13px"><font face="monospace, monospace">   a   b*-------------------------------|</font></span></div><div><font face="monospace, monospace">    \ /</font></div><div><span style="font-size:13px"><font face="monospace, monospace">free_blocks_tree</font></span></div><div><br></div><div>The whole carrier will be 
deallocated and released to the OS only when all blocks on it are free. There is no such thing as carrier defragmentation. But, AFAIK, Erlang may return free blocks from the carrier's tail to the OS by shrinking the carrier (e.g. free2 in the picture). In addition, two neighbouring free blocks are merged into one (e.g. if busy4 is freed, it will be merged with free1 and free2 into a single block).</div><div><br></div><div>As you can see, the carrier in my "picture" has some holes. Its memory utilization is ~ `sum(busy) / (sum(free) + sum(busy))`. For example, for {eheap_alloc,1} in your output, mbcs_block_size / mbcs_carriers_size = 391670536 / 626393264, which is about 0.625, the reported mbcs_usage.</div><div><br></div><div>Also, if BEAM asks free_blocks_tree for a free block of a certain size but the tree only has a bigger one, that block is split in two, which leads to additional fragmentation.</div><div><br></div><div>There are some strategies to improve this situation: </div><div>1) perform fewer alloc/dealloc operations (you may try a larger min_heap_size for new processes to reduce GC frequency)</div><div>2) change the `free_blocks_tree` sorting order so that it is address-ordered and blocks from the tail can be returned to the OS sooner</div><div>3) tune mbcs_block_size and mbcs_carrier_size to better fit the size of your most common data, which reduces fragmentation by making free blocks easier to reuse.</div><div><br></div><div>I don't know which one fits your needs. You may consult this doc <a href="http://erlang.org/doc/man/erts_alloc.html">http://erlang.org/doc/man/erts_alloc.html</a> for more details.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi Lucas,<br>I'm using 17.4 on ubuntu LTS 14.04. 
The settings I use are:<br>+K true<br>+P 2000000<br>+Q 1000000<br>Please find here below an additional set of data with the changes I've<br>stated here above (binary:copy/0 and the fullsweep_after 0 in the router).<br>Of the recon library I'm using the following calls:<br>recon_alloc:memory(allocated, current).<br>recon_alloc:memory(used, current).<br>recon_alloc:memory(usage, current).<br><br>All of this data is taken at random intervals of time.<br>- BEAM process RES memory:* 2.732 GB*<br>- Erlang memory:<br>[{total,1.7860298827290535},<br> {processes,1.4158401936292648},<br> {processes_used,1.4157484397292137},<br> {system,0.37018968909978867},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.846092149615288e-4},<br> {binary,0.20867645740509033},<br> {code,0.009268132038414478},<br> {ets,0.004821933805942535}]<br>- recon_alloc:<br>allocated: 3015796080 (2.808 GB)<br>used: 2161850416<br>usage: 0.7187714029859935<br><br>- BEAM process RES memory:* 2.813 GB*<br>- Erlang memory:<br>[{total,2.026990756392479},<br> {processes,1.6270370781421661},<br> {processes_used,1.6269719526171684},<br> {system,0.3999536782503128},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.8593634963035583e-4},<br> {binary,0.23845425993204117},<br> {code,0.009311830624938011},<br> {ets,0.004802137613296509}]<br>- recon_alloc:<br>allocated: 3098895728 (2.886 GB)<br>used: 2176172480<br>usage: 0.7023218278482198<br><br>- BEAM process RES memory:* 3.029 GB*<br>- Erlang memory:<br>[{total,2.351852521300316},<br> {processes,1.9361207410693169},<br> {processes_used,1.9360847249627113},<br> {system,0.415731780230999},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.8593634963035583e-4},<br> {binary,0.2539009377360344},<br> {code,0.009311830624938011},<br> {ets,0.004802137613296509}]<br>- recon_alloc:<br>allocated: 3337524592 (3.108 GB)<br>used: 2525365352<br>usage: 0.7548030747173055<br><br>- BEAM process RES memory:* 3.099 GB*<br>- Erlang memory:<br>[{total,2.0704088881611824},<br> 
{processes,1.6625376418232918},<br> {processes_used,1.6624245047569275},<br> {system,0.4078712463378906},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.8593634963035583e-4},<br> {binary,0.24636883288621902},<br> {code,0.009311830624938011},<br> {ets,0.004802137613296509}]<br>- recon_alloc:<br>allocated: 3400623472 (3.167 GB)<br>used: 2222575336<br>usage: 0.6552131374790817<br><br>- BEAM process RES memory:* 3.132 GB*<br>- Erlang memory:<br>[{total,2.367126949131489},<br> {processes,1.9388784170150757},<br> {processes_used,1.938723236322403},<br> {system,0.4282485321164131},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.8593634963035583e-4},<br> {binary,0.2667432576417923},<br> {code,0.009311830624938011},<br> {ets,0.004802137613296509}]<br>- recon_alloc:<br>allocated: 3435644272 (3.200 GB)<br>used: 2541469864<br>usage: 0.7397146313173368<br><br>- BEAM process RES memory:* 3.307 GB*<br>- Erlang memory:<br>[{total,2.379016488790512},<br> {processes,1.9780860394239426},<br> {processes_used,1.9779272973537445},<br> {system,0.4009304493665695},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.8593634963035583e-4},<br> {binary,0.23943009227514267},<br> {code,0.009311830624938011},<br> {ets,0.004802137613296509}]<br>- recon_alloc:<br>allocated: 3619329392 (3.371 GB)<br>used: 2554804000<br>usage: 0.704330124232303<br><br>- BEAM process RES memory:* 3.351 GB*<br>- Erlang memory:<br>[{total,2.607522390782833},<br> {processes,2.168950654566288},<br> {processes_used,2.1688189953565598},<br> {system,0.4385717362165451},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.8593634963035583e-4},<br> {binary,0.2771267145872116},<br> {code,0.009311830624938011},<br> {ets,0.004802137613296509}]<br>- recon_alloc:<br>allocated: 3669321072 (3.417 GB)<br>used: 2799919616<br>usage: 0.7629933213977411<br><br>- BEAM process RES memory:* 3.469 GB*<br>- Erlang memory:<br>[{total,2.2596140429377556},<br> {processes,1.8098593652248383},<br> {processes_used,1.8098137602210045},<br> 
{system,0.44975467771291733},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.86836938560009e-4},<br> {binary,0.2881493419408798},<br> {code,0.009375222958624363},<br> {ets,0.00480380654335022}]<br>- recon_alloc:<br>allocated: 3789014384 (3.528 GB)<br>used: 2425613912<br>usage: 0.6401929098614897<br><br>- BEAM process RES memory:* 3.660 GB*<br>- Erlang memory:<br>[{total,2.692381367087364},<br> {processes,2.2282255738973618},<br> {processes_used,2.228082850575447},<br> {system,0.46415579319000244},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.86836938560009e-4},<br> {binary,0.30247989296913147},<br> {code,0.009375222958624363},<br> {ets,0.00480380654335022}]<br>- recon_alloc:<br>allocated: 3993933168 (3.719 GB)<br>used: 2890714704<br>usage: 0.7233507625681828<br><br>- BEAM process RES memory:* 3.667 GB*<br>- Erlang memory:<br>[{total,2.4165985733270645},<br> {processes,1.9264011159539223},<br> {processes_used,1.9263720959424973},<br> {system,0.49019745737314224},<br> {atom,4.000673070549965e-4},<br> {atom_used,3.86836938560009e-4},<br> {binary,0.3284950777888298},<br> {code,0.009375222958624363},<br> {ets,0.00480380654335022}]<br>- recon_alloc:<br>allocated: 4001830256 (3.727 GB)<br>used: 2594872464<br>usage: 0.6483950197811689<br><br>It looks like the memory allocated has some mismatch with the memory<br>reported by the OS, but maybe this is just a timing issue (since top<br>provides an average during a period of time)?<br>Anyway, the BEAM process keeps on increasing. 
Also, it looks like memory<br>usage bounces anywhere from 64-72% and I don't know if that's a good figure<br>or not.<br>This is what I get from a check on memory fragmentation:<br>1> recon_alloc:fragmentation(current).<br>[{{eheap_alloc,1},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.6252789717100151},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,391670536},<br>   {mbcs_carriers_size,626393264}]},<br> {{eheap_alloc,2},<br>  [{sbcs_usage,0.6168021896258503},<br>   {mbcs_usage,0.6893887270883688},<br>   {sbcs_block_size,371384},<br>   {sbcs_carriers_size,602112},<br>   {mbcs_block_size,370926112},<br>   {mbcs_carriers_size,538050736}]},<br> {{eheap_alloc,3},<br>  [{sbcs_usage,0.9991333400321544},<br>   {mbcs_usage,0.7006580932915004},<br>   {sbcs_block_size,5091008},<br>   {sbcs_carriers_size,5095424},<br>   {mbcs_block_size,324091688},<br>   {mbcs_carriers_size,462553264}]},<br> {{eheap_alloc,4},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.6924985776876923},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,305976264},<br>   {mbcs_carriers_size,441843888}]},<br> {{eheap_alloc,5},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.6397496430493375},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,207536944},<br>   {mbcs_carriers_size,324403376}]},<br> {{eheap_alloc,8},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.6660125315617468},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,166472816},<br>   {mbcs_carriers_size,249954480}]},<br> {{eheap_alloc,6},<br>  [{sbcs_usage,0.9980070153061225},<br>   {mbcs_usage,0.6770963446791575},<br>   {sbcs_block_size,600912},<br>   {sbcs_carriers_size,602112},<br>   {mbcs_block_size,169065768},<br>   {mbcs_carriers_size,249692336}]},<br> {{eheap_alloc,7},<br>  [{sbcs_usage,0.997382155987395},<br>   {mbcs_usage,0.6925225022824623},<br>   {sbcs_block_size,972296},<br>   {sbcs_carriers_size,974848},<br>   
{mbcs_block_size,167834424},<br>   {mbcs_carriers_size,242352304}]},<br> {{ll_alloc,0},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7387279228243809},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,112706224},<br>   {mbcs_carriers_size,152567976}]},<br> {{binary_alloc,1},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7614033191804973},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,58507096},<br>   {mbcs_carriers_size,76841136}]},<br> {{binary_alloc,4},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7179048085736076},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,46131288},<br>   {mbcs_carriers_size,64258224}]},<br> {{binary_alloc,7},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.6128136272843548},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,27651200},<br>   {mbcs_carriers_size,45121712}]},<br> {{binary_alloc,5},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.6492102167567332},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,32186648},<br>   {mbcs_carriers_size,49578160}]},<br> {{binary_alloc,2},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7715637758298944},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,56051664},<br>   {mbcs_carriers_size,72646832}]},<br> {{binary_alloc,3},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7514879308127178},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,49077272},<br>   {mbcs_carriers_size,65306800}]},<br> {{binary_alloc,6},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.6757064168943689},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,25706456},<br>   {mbcs_carriers_size,38043824}]},<br> {{binary_alloc,8},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7016146506167494},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,25956408},<br>   
{mbcs_carriers_size,36995248}]},<br> {{fix_alloc,4},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7409146075507547},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,22748888},<br>   {mbcs_carriers_size,30703792}]},<br> {{fix_alloc,2},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.8581522751225302},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,33547232},<br>   {mbcs_carriers_size,39092400}]},<br> {{fix_alloc,3},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.8630094940716118},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,26497664},<br>   {mbcs_carriers_size,...}]},<br> {{driver_alloc,1},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7615924083651237},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,0},<br>   {mbcs_block_size,...},<br>   {...}]},<br> {{fix_alloc,1},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.8947156992151927},<br>   {sbcs_block_size,0},<br>   {sbcs_carriers_size,...},<br>   {...}|...]},<br> {{fix_alloc,5},<br>  [{sbcs_usage,1.0},<br>   {mbcs_usage,0.7166474590927997},<br>   {sbcs_block_size,...},<br>   {...}|...]},<br> {{ll_alloc,7},[{sbcs_usage,1.0},{mbcs_usage,...},{...}|...]},<br> {{driver_alloc,2},[{sbcs_usage,...},{...}|...]},<br> {{driver_alloc,4},[{...}|...]},<br> {{ll_alloc,...},[...]},<br> {{...},...},<br> {...}|...]<br><br>Unless I'm mistaken reading this data, everything looks fine there, with<br>normal usages.<br>What can I do else to address this issue?<br>Thank you for your help.<br>r.</blockquote><div><br></div></div>