[erlang-questions] Garbage Collection, BEAM memory and Erlang memory

Сергей Прохоров seriy.pr@REDACTED
Fri Jan 23 18:29:06 CET 2015


Hi, Roberto.

It looks like you have some multiblock allocator fragmentation.
As you can see in the `recon_alloc:fragmentation(current).` output, your
mbcs_usage of ~0.7 correlates with `recon_alloc:memory(usage, current).`

So the problem isn't in refcounted binaries, but in multiblock allocator
fragmentation.
Also, note that the worst-utilized allocators are the allocators for Erlang
process heaps (eheap_alloc); the allocators for refc binaries (binary_alloc)
are in second place.
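
To check that correlation yourself, here is a minimal sketch that sums the
multiblock figures over all allocator instances. It assumes the list has the
shape shown in the quoted `recon_alloc:fragmentation(current).` output; the
module and function names are made up for illustration.

```erlang
-module(frag_summary).
-export([mbcs_usage/1]).

%% Frag is expected to look like the recon_alloc:fragmentation/1 result:
%%   [{{Allocator, Instance}, [{mbcs_block_size, B},
%%                             {mbcs_carriers_size, C} | _]}]
%% Returns the overall ratio of used block bytes to carrier bytes.
%% Crashes with badarith if no multiblock carriers are reported at all.
mbcs_usage(Frag) ->
    Blocks = lists:sum([proplists:get_value(mbcs_block_size, Props, 0)
                        || {_Alloc, Props} <- Frag]),
    Carriers = lists:sum([proplists:get_value(mbcs_carriers_size, Props, 0)
                          || {_Alloc, Props} <- Frag]),
    Blocks / Carriers.
```

On a live node with recon loaded you would call it as
`frag_summary:mbcs_usage(recon_alloc:fragmentation(current)).` and compare
the result with `recon_alloc:memory(usage, current).`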

The Erlang allocators reserve large chunks of memory (multiblock carriers,
MBCs) and then hand out pieces of them (so-called blocks) on demand:
[ busy1 | busy2 | busy3 | free ....... ]

When some Erlang subsystem releases memory it was using (e.g. a process dies
or a binary is released), that space is marked as free on the carrier and a
pointer to it is placed in a binary tree so that it can be reused:

[ busy1 | busy2 | busy3 | free1 | busy4 | free2 ]

   *--------------------^               ^
   a   b*-------------------------------|
    \ /
free_blocks_tree

The whole carrier is deallocated and released to the OS only when all blocks
on it are free. There is no such thing as carrier defragmentation. But,
AFAIK, Erlang may return free blocks at the carrier's tail to the OS by
shrinking the carrier (e.g. free2 in the picture). Also, two neighbouring
free blocks are merged into one (e.g. if busy4 is freed, it will be merged
into a single block with free1 and free2).

As you can see, the carrier in my "picture" has some holes. Its memory
utilization is ~ `sum(busy) / (sum(free) + sum(busy))`.
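
To make the picture concrete, here is a toy model of a carrier as a list of
`{busy | free, Size}` blocks. It is only a sketch of the idea (the module
name and the block sizes are invented), not how ERTS stores carriers.

```erlang
-module(carrier_sketch).
-export([utilization/1, free_block/2, demo/0]).

%% sum(busy) / (sum(free) + sum(busy))
utilization(Carrier) ->
    Busy = lists:sum([S || {busy, S} <- Carrier]),
    Total = lists:sum([S || {_State, S} <- Carrier]),
    Busy / Total.

%% Mark the N-th block (1-based) as free, then merge free neighbours.
free_block(N, Carrier) ->
    {Before, [{_State, Size} | After]} = lists:split(N - 1, Carrier),
    coalesce(Before ++ [{free, Size} | After]).

%% Two adjacent free blocks become one bigger free block.
coalesce([{free, A}, {free, B} | Rest]) -> coalesce([{free, A + B} | Rest]);
coalesce([Block | Rest]) -> [Block | coalesce(Rest)];
coalesce([]) -> [].

demo() ->
    %% [ busy1 | busy2 | busy3 | free1 | busy4 | free2 ]
    Carrier = [{busy, 4}, {busy, 4}, {busy, 4},
               {free, 2}, {busy, 4}, {free, 2}],
    Usage = utilization(Carrier),       %% 16 / 20 = 0.8
    Merged = free_block(5, Carrier),    %% free1 + busy4 + free2 -> {free, 8}
    {Usage, Merged}.
```

After busy4 is freed the model holds three busy blocks followed by one free
block of size 8, which is the tail that could be given back to the OS.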

Also, if BEAM asks free_blocks_tree for a free block of a certain size but
the tree only has a bigger one, that block is split in two, leading to
additional fragmentation.
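
A rough sketch of that split, again on a list of `{busy | free, Size}`
blocks. For simplicity it takes the first free block that is big enough
instead of searching a tree, so treat it as an illustration only; the names
are hypothetical.

```erlang
-module(split_sketch).
-export([alloc/2]).

%% Exact fit: the free block is reused as is.
alloc(Size, [{free, Size} | Rest]) ->
    {ok, [{busy, Size} | Rest]};
%% Bigger free block: split it into a busy part and a smaller free rest.
alloc(Size, [{free, Big} | Rest]) when Big > Size ->
    {ok, [{busy, Size}, {free, Big - Size} | Rest]};
%% Busy or too-small block: keep looking further along the carrier.
alloc(Size, [Block | Rest]) ->
    case alloc(Size, Rest) of
        {ok, Rest1} -> {ok, [Block | Rest1]};
        none -> none
    end;
%% No suitable block here; a real allocator would create a new carrier.
alloc(_Size, []) ->
    none.
```

For example, `split_sketch:alloc(3, [{busy,4},{free,8}])` leaves a free
block of 5 behind, which may later be too small for the next request.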

There are some strategies to improve this situation:
1) perform fewer alloc/dealloc operations (you may try a larger
min_heap_size for new processes to reduce GC frequency)
2) change the `free_blocks_tree` sorting order so that it is address-ordered
and blocks at the tail can be returned to the OS sooner
3) tune mbcs_block_size and mbcs_carrier_size to better fit the size of
your most common data, reducing fragmentation by making free blocks easier
to reuse.
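
As a starting point for experiments, here is a hedged sketch. The heap size
and the flag values in the comments are illustrative, not recommendations,
and the flag names (`+hms`, `+MHas`, `+MBas`, `+MHlmbcs`) are taken from the
erl and erts_alloc documentation rather than from anything measured on your
system. The module and function names are hypothetical.

```erlang
-module(tuning_sketch).
-export([spawn_with_bigger_heap/2, heap_alloc_settings/0]).

%% Strategy 1: start a process with a larger initial heap (in words) so
%% that it has to grow and garbage-collect less often. The same default
%% can be set for all processes with the +hms emulator flag.
spawn_with_bigger_heap(Fun, Words) ->
    spawn_opt(Fun, [{min_heap_size, Words}]).

%% Strategies 2 and 3 are emulator flags given at VM start, e.g.:
%%   +MHas aobf       address order best fit for eheap_alloc
%%   +MBas aobf       the same for binary_alloc
%%   +MHlmbcs 10240   largest multiblock carrier size, in kilobytes
%% This returns the eheap_alloc settings currently in effect, so you can
%% verify that a flag was actually picked up.
heap_alloc_settings() ->
    erlang:system_info({allocator, eheap_alloc}).
```

Change one setting at a time and compare `recon_alloc:fragmentation(current).`
before and after, since the effect depends heavily on the workload.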

I don't know which one fits your needs. You may consult the erts_alloc
documentation at http://erlang.org/doc/man/erts_alloc.html for more details.

> Hi Lucas,
> I'm using 17.4 on ubuntu LTS 14.04. The settings I use are:
> +K true
> +P 2000000
> +Q 1000000
> Please find here below an additional set of data with the changes I've
> stated here above (binary:copy/0 and the fullsweep_after 0 in the router).
> Of the recon library I'm using the following calls:
> recon_alloc:memory(allocated, current).
> recon_alloc:memory(used, current).
> recon_alloc:memory(usage, current).
>
> All of this data is taken at random intervals of time.
> - BEAM process RES memory:* 2.732 GB*
> - Erlang memory:
> [{total,1.7860298827290535},
>  {processes,1.4158401936292648},
>  {processes_used,1.4157484397292137},
>  {system,0.37018968909978867},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.846092149615288e-4},
>  {binary,0.20867645740509033},
>  {code,0.009268132038414478},
>  {ets,0.004821933805942535}]
> - recon_alloc:
> allocated: 3015796080 (2.808 GB)
> used: 2161850416
> usage: 0.7187714029859935
>
> - BEAM process RES memory:* 2.813 GB*
> - Erlang memory:
> [{total,2.026990756392479},
>  {processes,1.6270370781421661},
>  {processes_used,1.6269719526171684},
>  {system,0.3999536782503128},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.8593634963035583e-4},
>  {binary,0.23845425993204117},
>  {code,0.009311830624938011},
>  {ets,0.004802137613296509}]
> - recon_alloc:
> allocated: 3098895728 (2.886 GB)
> used: 2176172480
> usage: 0.7023218278482198
>
> - BEAM process RES memory:* 3.029 GB*
> - Erlang memory:
> [{total,2.351852521300316},
>  {processes,1.9361207410693169},
>  {processes_used,1.9360847249627113},
>  {system,0.415731780230999},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.8593634963035583e-4},
>  {binary,0.2539009377360344},
>  {code,0.009311830624938011},
>  {ets,0.004802137613296509}]
> - recon_alloc:
> allocated: 3337524592 (3.108 GB)
> used: 2525365352
> usage: 0.7548030747173055
>
> - BEAM process RES memory:* 3.099 GB*
> - Erlang memory:
> [{total,2.0704088881611824},
>  {processes,1.6625376418232918},
>  {processes_used,1.6624245047569275},
>  {system,0.4078712463378906},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.8593634963035583e-4},
>  {binary,0.24636883288621902},
>  {code,0.009311830624938011},
>  {ets,0.004802137613296509}]
> - recon_alloc:
> allocated: 3400623472 (3.167 GB)
> used: 2222575336
> usage: 0.6552131374790817
>
> - BEAM process RES memory:* 3.132 GB*
> - Erlang memory:
> [{total,2.367126949131489},
>  {processes,1.9388784170150757},
>  {processes_used,1.938723236322403},
>  {system,0.4282485321164131},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.8593634963035583e-4},
>  {binary,0.2667432576417923},
>  {code,0.009311830624938011},
>  {ets,0.004802137613296509}]
> - recon_alloc:
> allocated: 3435644272 (3.200 GB)
> used: 2541469864
> usage: 0.7397146313173368
>
> - BEAM process RES memory:* 3.307 GB*
> - Erlang memory:
> [{total,2.379016488790512},
>  {processes,1.9780860394239426},
>  {processes_used,1.9779272973537445},
>  {system,0.4009304493665695},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.8593634963035583e-4},
>  {binary,0.23943009227514267},
>  {code,0.009311830624938011},
>  {ets,0.004802137613296509}]
> - recon_alloc:
> allocated: 3619329392 (3.371 GB)
> used: 2554804000
> usage: 0.704330124232303
>
> - BEAM process RES memory:* 3.351 GB*
> - Erlang memory:
> [{total,2.607522390782833},
>  {processes,2.168950654566288},
>  {processes_used,2.1688189953565598},
>  {system,0.4385717362165451},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.8593634963035583e-4},
>  {binary,0.2771267145872116},
>  {code,0.009311830624938011},
>  {ets,0.004802137613296509}]
> - recon_alloc:
> allocated: 3669321072 (3.417 GB)
> used: 2799919616
> usage: 0.7629933213977411
>
> - BEAM process RES memory:* 3.469 GB*
> - Erlang memory:
> [{total,2.2596140429377556},
>  {processes,1.8098593652248383},
>  {processes_used,1.8098137602210045},
>  {system,0.44975467771291733},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.86836938560009e-4},
>  {binary,0.2881493419408798},
>  {code,0.009375222958624363},
>  {ets,0.00480380654335022}]
> - recon_alloc:
> allocated: 3789014384 (3.528 GB)
> used: 2425613912
> usage: 0.6401929098614897
>
> - BEAM process RES memory:* 3.660 GB*
> - Erlang memory:
> [{total,2.692381367087364},
>  {processes,2.2282255738973618},
>  {processes_used,2.228082850575447},
>  {system,0.46415579319000244},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.86836938560009e-4},
>  {binary,0.30247989296913147},
>  {code,0.009375222958624363},
>  {ets,0.00480380654335022}]
> - recon_alloc:
> allocated: 3993933168 (3.719 GB)
> used: 2890714704
> usage: 0.7233507625681828
>
> - BEAM process RES memory:* 3.667 GB*
> - Erlang memory:
> [{total,2.4165985733270645},
>  {processes,1.9264011159539223},
>  {processes_used,1.9263720959424973},
>  {system,0.49019745737314224},
>  {atom,4.000673070549965e-4},
>  {atom_used,3.86836938560009e-4},
>  {binary,0.3284950777888298},
>  {code,0.009375222958624363},
>  {ets,0.00480380654335022}]
> - recon_alloc:
> allocated: 4001830256 (3.727 GB)
> used: 2594872464
> usage: 0.6483950197811689
>
> It looks like the memory allocated has some mismatch with the memory
> reported by the OS, but maybe this is just a timing issue (since top
> provides an average during a period of time)?
> Anyway, the BEAM process keeps on increasing. Also, it looks like memory
> usage bounces anywhere from 64-72% and I don't know if that's a good figure
> or not.
> This is what I get from a check on memory fragmentation:
> 1> recon_alloc:fragmentation(current).
> [{{eheap_alloc,1},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.6252789717100151},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,391670536},
>    {mbcs_carriers_size,626393264}]},
>  {{eheap_alloc,2},
>   [{sbcs_usage,0.6168021896258503},
>    {mbcs_usage,0.6893887270883688},
>    {sbcs_block_size,371384},
>    {sbcs_carriers_size,602112},
>    {mbcs_block_size,370926112},
>    {mbcs_carriers_size,538050736}]},
>  {{eheap_alloc,3},
>   [{sbcs_usage,0.9991333400321544},
>    {mbcs_usage,0.7006580932915004},
>    {sbcs_block_size,5091008},
>    {sbcs_carriers_size,5095424},
>    {mbcs_block_size,324091688},
>    {mbcs_carriers_size,462553264}]},
>  {{eheap_alloc,4},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.6924985776876923},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,305976264},
>    {mbcs_carriers_size,441843888}]},
>  {{eheap_alloc,5},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.6397496430493375},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,207536944},
>    {mbcs_carriers_size,324403376}]},
>  {{eheap_alloc,8},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.6660125315617468},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,166472816},
>    {mbcs_carriers_size,249954480}]},
>  {{eheap_alloc,6},
>   [{sbcs_usage,0.9980070153061225},
>    {mbcs_usage,0.6770963446791575},
>    {sbcs_block_size,600912},
>    {sbcs_carriers_size,602112},
>    {mbcs_block_size,169065768},
>    {mbcs_carriers_size,249692336}]},
>  {{eheap_alloc,7},
>   [{sbcs_usage,0.997382155987395},
>    {mbcs_usage,0.6925225022824623},
>    {sbcs_block_size,972296},
>    {sbcs_carriers_size,974848},
>    {mbcs_block_size,167834424},
>    {mbcs_carriers_size,242352304}]},
>  {{ll_alloc,0},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7387279228243809},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,112706224},
>    {mbcs_carriers_size,152567976}]},
>  {{binary_alloc,1},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7614033191804973},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,58507096},
>    {mbcs_carriers_size,76841136}]},
>  {{binary_alloc,4},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7179048085736076},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,46131288},
>    {mbcs_carriers_size,64258224}]},
>  {{binary_alloc,7},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.6128136272843548},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,27651200},
>    {mbcs_carriers_size,45121712}]},
>  {{binary_alloc,5},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.6492102167567332},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,32186648},
>    {mbcs_carriers_size,49578160}]},
>  {{binary_alloc,2},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7715637758298944},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,56051664},
>    {mbcs_carriers_size,72646832}]},
>  {{binary_alloc,3},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7514879308127178},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,49077272},
>    {mbcs_carriers_size,65306800}]},
>  {{binary_alloc,6},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.6757064168943689},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,25706456},
>    {mbcs_carriers_size,38043824}]},
>  {{binary_alloc,8},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7016146506167494},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,25956408},
>    {mbcs_carriers_size,36995248}]},
>  {{fix_alloc,4},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7409146075507547},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,22748888},
>    {mbcs_carriers_size,30703792}]},
>  {{fix_alloc,2},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.8581522751225302},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,33547232},
>    {mbcs_carriers_size,39092400}]},
>  {{fix_alloc,3},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.8630094940716118},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,26497664},
>    {mbcs_carriers_size,...}]},
>  {{driver_alloc,1},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7615924083651237},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,0},
>    {mbcs_block_size,...},
>    {...}]},
>  {{fix_alloc,1},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.8947156992151927},
>    {sbcs_block_size,0},
>    {sbcs_carriers_size,...},
>    {...}|...]},
>  {{fix_alloc,5},
>   [{sbcs_usage,1.0},
>    {mbcs_usage,0.7166474590927997},
>    {sbcs_block_size,...},
>    {...}|...]},
>  {{ll_alloc,7},[{sbcs_usage,1.0},{mbcs_usage,...},{...}|...]},
>  {{driver_alloc,2},[{sbcs_usage,...},{...}|...]},
>  {{driver_alloc,4},[{...}|...]},
>  {{ll_alloc,...},[...]},
>  {{...},...},
>  {...}|...]
>
> Unless I'm mistaken reading this data, everything looks fine there, with
> normal usages.
> What can I do else to address this issue?
> Thank you for your help.
> r.