[erlang-questions] Possibly memory leak in R17

Tue Apr 29 19:27:41 CEST 2014

Hello,

In the allocator snapshot you sent me it's possible to see that the
apparent memory leak has to do with memory fragmentation.

About a week ago I sent a patch to upstream recon to include the mbcs pool
in the calculation of total used memory. With this patch (that is now part
of the master branch for recon) we can see that the utilization of the
binary allocators is very low.

34> recon_alloc:fragmentation(current).
[{{binary_alloc,10},
  [{sbcs_usage,0.9663911159624771},
   {mbcs_usage,0.00851128256381634},
   {sbcs_block_size,12975432},
   {sbcs_carriers_size,13426688},
   {mbcs_block_size,8594792},
   {mbcs_carriers_size,1009811616}]},
 {{binary_alloc,21},
  [{sbcs_usage,0.9576609472490074},
   {mbcs_usage,0.02275766909139376},
   {sbcs_block_size,69155072},
   {sbcs_carriers_size,72212480},
   {mbcs_block_size,21489512},
   {mbcs_carriers_size,944275616}]},
 {{binary_alloc,4},
  [{sbcs_usage,0.9769352091165413},
   {mbcs_usage,0.013649233045825031},
   {sbcs_block_size,26610152},
   {sbcs_carriers_size,27238400},
   {mbcs_block_size,12115776},
   {mbcs_carriers_size,887652512}]},
 {{binary_alloc,8},
  [{sbcs_usage,0.9620084547199639},
   {mbcs_usage,0.01384863563143117},
   {sbcs_block_size,26172048},
   {sbcs_carriers_size,27205632},
   {mbcs_block_size,10651864},
   {mbcs_carriers_size,769163424}]},
 {{binary_alloc,20},
  [{sbcs_usage,0.9607166148513436},
   {mbcs_usage,0.022427188875552406},
   {sbcs_block_size,34412408},
   {sbcs_carriers_size,35819520},
   {mbcs_block_size,15121920},
   {mbcs_carriers_size,674267296}]},...]

So you have a "classic" memory fragmentation problem. The reason you are
seeing it in 17.0 and not in R16 is probably that we changed the default
allocation strategy from "best fit" to "address order first fit carrier
best fit".

To fix this I would first try to change the allocation strategy for
binaries to "address order first fit carrier address order best fit" (+MBas
aoffcaobf) and see if that helps.

If that does not work you could always disable the mbcs pool (+MBacul 0),
which would make the allocators behave much as they did in R16. But you
lose the nice mbcs pool :(
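
Either flag goes on the erl command line. To check what a running node is actually using, something like the following should work. It is only a sketch: it assumes that erlang:system_info({allocator, binary_alloc}) returns a list of {instance, N, Props} tuples whose options list carries as and acul entries, which is worth confirming on your build. AllocOpts is an illustrative name, not part of any library.

```erlang
%% Start the node with one of the suggested settings, for example:
%%   erl +MBas aoffcaobf
%%   erl +MBacul 0
%% then paste this into the shell.
AllocOpts = fun(Alloc, Keys) ->
    case erlang:system_info({allocator, Alloc}) of
        false ->
            %% Allocator is disabled; nothing to report.
            [];
        Instances ->
            %% One entry per allocator instance; a key that is absent
            %% from the options list shows up as undefined.
            [{N, [{K, proplists:get_value(K, proplists:get_value(options, Props, []))}
                  || K <- Keys]}
             || {instance, N, Props} <- Instances]
    end
end.

AllocOpts(binary_alloc, [as, acul]).
```

If the options are reported under other names on your release, the result will simply show undefined for those keys.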

One thing to keep in mind here is that maybe you don't have to fix it at
all. The memory is not lost; it is just not used at the moment and cannot be
released to the OS because some binaries are still present in those carriers.
So unless you crash with an out-of-memory error I would not change anything.
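
To keep an eye on this over time rather than change anything, the gap between allocated and used memory can be sampled periodically. A rough sketch, assuming recon is on the code path and that your recon version exports recon_alloc:memory/1 with the used, allocated, unused and usage keys (Report is just an illustrative name):

```erlang
%% Returns a list of {Key, Value} pairs, or recon_not_available
%% when the recon_alloc module cannot be loaded on this node.
Report = fun() ->
    case code:ensure_loaded(recon_alloc) of
        {module, recon_alloc} ->
            [{K, recon_alloc:memory(K)} || K <- [used, allocated, unused, usage]];
        {error, _} ->
            recon_not_available
    end
end.

Report().
```

If allocated keeps growing while used stays flat, the fragmentation is getting worse; if both level off, leaving things alone is probably fine.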

Lukas

On Tue, Apr 29, 2014 at 5:17 PM, Max Lapshin <max.lapshin@REDACTED> wrote:

> (flussonic@REDACTED)3> recon_alloc:snapshot_get().
> {[{total,2201115696},
>   {processes,140618848},
>   {processes_used,140254568},
>   {system,2060496848},
>   {atom,561761},
>   {atom_used,551696},
>   {binary,1939374344},
>   {code,17258778},
>   {ets,85224424}],
>  [{{sys_alloc,0},
>    [{options,[{e,true},{m,libc},{tt,131072},{tp,0}]}]},
>   {{mseg_alloc,0},
>    [{memkind,[{name,"all memory"},
>               {status,[{cached_segments,0},
>                        {cache_hits,79},
>                        {segments,19,20,20},
>                        {segments_size,35913728,36175872,36175872},
>                        {segments_watermark,19}]},
>               {calls,[{mseg_alloc,0,104},
>                       {mseg_dealloc,0,85},
>                       {mseg_realloc,0,0},
>                       {mseg_create_resize,0,0},
>                       {mseg_create,0,25},
>                       {mseg_destroy,0,7},
>                       {mseg_recreate,0,0},
>                       {mseg_clear_cache,0,0},
>                       {mseg_check_cache,0,6}]}]},
>     {options,[{amcbf,4194304},{rmcbf,20},{mcs,10},{scs,0}]},
>     {version,"0.9"}]},
>
> .....
>
>
>
> (flussonic@REDACTED)6> ets:foldl(fun(T,Sum) -> erlang:external_size(T) +
> Sum end, 0, live_streams_attrs).
> 1749128853
>
> (flussonic@REDACTED)9> lists:sum([ ets:foldl(fun(T,Sum) ->
> erlang:external_size(T) + Sum end, 0, T) || T <- ets:all(), is_atom(T)]).
> 1758913413
>
>
> So, it looks like only 2G is really used. Where could the rest be leaking? Is
> it possible that data is leaking or accumulating somewhere in network
> buffers?
>
> The version running on R16 doesn't have such problems.