[erlang-questions] Erlang OTP 17.3 memory fragment problem (maybe)

Daniel liudanking@REDACTED
Mon Apr 13 03:13:57 CEST 2015


I am running ejabberd 14.07 with some custom modules on a cluster of 3 servers; each server is equipped with 15.75 GB of memory. Other server information is as follows:
Erlang/OTP version: 17.3.
OS: Ubuntu 12.04 x64
ejabberd start parameters: +K true -smp auto +P 250000
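(In case it matters, my understanding of those emulator flags is:

  +K true     enable kernel poll (epoll on Linux)
  -smp auto   start the SMP emulator with as many schedulers as logical cores
  +P 250000   raise the maximum number of simultaneous processes to 250000

Please correct me if I have any of them wrong.)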

The cluster has about 2000 online users and 2000 MUC rooms every day.

The problem is that memory usage keeps increasing over time.

I have read this post: http://erlang.org/pipermail/erlang-questions/2014-April/078773.html, and I think my problem is similar to that one (by the way, Recon is really a nice tool):

*top* shows that beam.smp uses 3.8 GB of memory, which is the same figure reported by recon_alloc:memory(allocated).
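For completeness, this is roughly how I read those two numbers (allocated is what the allocators have mapped from the OS, used is what is actually occupied by live blocks):

  recon_alloc:memory(allocated).  %% ~3.8 GB here, same as what top reports for beam.smp
  recon_alloc:memory(used).       %% the part actually holding live data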

But recon_alloc:memory(allocated_types) shows that most of that memory is allocated by binary_alloc (about 82%):

(ejabberd@REDACTED)27> recon_alloc:memory(allocated_types).
[{binary_alloc,3504538080},
 {driver_alloc,17498592},
 {eheap_alloc,460747232},
 {ets_alloc,112394720},
 {fix_alloc,4391392},
 {ll_alloc,36700592},
 {sl_alloc,197088},
 {std_alloc,1769952},
 {temp_alloc,393528}]
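Summing those values and dividing gives the binary share; a quick sketch in the shell (for this particular snapshot it comes out closer to 0.85, same ballpark as the ~82% I mentioned):

  Types = recon_alloc:memory(allocated_types),
  Total = lists:sum([V || {_Type, V} <- Types]),
  proplists:get_value(binary_alloc, Types) / Total.
  %% ~0.85 here: ~3.5 GB of binary_alloc out of ~4.1 GB in total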

Memory usage of the Erlang VM is only 16.8%:
(ejabberd@REDACTED)31> recon_alloc:memory(usage).
0.1680850734782954
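As far as I understand, usage is simply used divided by allocated, so this says that most of what the VM has mapped is empty:

  recon_alloc:memory(used) / recon_alloc:memory(allocated).
  %% ~0.168 here, i.e. only about 650 MB of the ~3.8 GB is live data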
Then I checked VM memory fragmentation with recon_alloc:fragmentation(current):

(ejabberd@REDACTED)30> recon_alloc:fragmentation(current).
[{{binary_alloc,2},
  [{sbcs_usage,0.5666586947278912},
   {mbcs_usage,0.03268338368702292},
   {sbcs_block_size,341192},
   {sbcs_carriers_size,602112},
   {mbcs_block_size,66693536},
   {mbcs_carriers_size,2040594592}]},
 {{binary_alloc,1},
  [{sbcs_usage,0.6408052884615385},
   {mbcs_usage,0.03707295337512568},
   {sbcs_block_size,341216},
   {sbcs_carriers_size,532480},
   {mbcs_block_size,52870816},
   {mbcs_carriers_size,1426129056}]},
 {{eheap_alloc,2},
  [{sbcs_usage,0.966106053866951},
   {mbcs_usage,0.6588223404170993},
   {sbcs_block_size,247722824},
   {sbcs_carriers_size,256413696},
   {mbcs_block_size,66492040},
   {mbcs_carriers_size,100925600}]},
 {{eheap_alloc,1},
  [{sbcs_usage,0.9537586503097174},
   {mbcs_usage,0.8195803978109758},
   {sbcs_block_size,211905456},
   {sbcs_carriers_size,222179328},
   {mbcs_block_size,53497304},
   {mbcs_carriers_size,65274016}]},
 {{ets_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9124718235517926},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,74211640},
   {mbcs_carriers_size,81330336}]},
 {{ll_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7071616423963475},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,10381280},
   {mbcs_carriers_size,14680208}]},
 {{ets_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.879908981954333},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,27276024},
   {mbcs_carriers_size,30998688}]},
 {{driver_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7534514469892656},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,9530112},
   {mbcs_carriers_size,12648608}]},
 {{driver_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.39700285601535695},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1899376},
   {mbcs_carriers_size,4784288}]},
 {{fix_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.2179459675390966},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,471384},
   {mbcs_carriers_size,2162848}]},
 {{ll_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8666539121534861},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,5907016},
   {mbcs_carriers_size,6815888}]},
 {{fix_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7809480832679874},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1689072},
   {mbcs_carriers_size,2162848}]},
 {{ll_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9754967214960627},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,14831936},
   {mbcs_carriers_size,15204496}]},
 {{std_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.40772631122199926},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,240552},
   {mbcs_carriers_size,589984}]},
 {{std_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.43574063025437976},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,257080},
   {mbcs_carriers_size,589984}]},
 {{std_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.4444866301459023},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,262240},
   {mbcs_carriers_size,589984}]},
 {{eheap_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.12620470903989264},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,33104},
   {mbcs_carriers_size,262304}]},
 {{temp_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131176}]},
 {{temp_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131176}]},
 {{temp_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,...}]},
 {{sl_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,...},
   {...}]},
 {{binary_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0017048222113979542},
   {sbcs_block_size,0},
   {sbcs_carriers_size,...},
   {...}|...]},
 {{sl_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.006941061860691671},
   {sbcs_block_size,...},
   {...}|...]},
 {{fix_alloc,0},
  [{sbcs_usage,1.0},{mbcs_usage,...},{...}|...]},
 {{driver_alloc,0},[{sbcs_usage,...},{...}|...]},
 {{sl_alloc,2},[{...}|...]},
 {{ets_alloc,...},[...]}]

It shows that mbcs_usage of binary_alloc is very low.
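As far as I can tell, mbcs_usage is just mbcs_block_size divided by mbcs_carriers_size; for {binary_alloc,2} above:

  66693536 / 2040594592.
  %% ~0.033: the ~2 GB of multiblock carriers held by that instance
  %% contain only ~67 MB of live binary data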

So my questions are:
Does this mean that I have hit a 'classic' memory fragmentation problem? If YES, how can I return the fragmented memory to the OS? (My ejabberd has been killed by the Linux OOM killer several times.) If NO, what causes the memory usage to keep increasing?
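In case it is relevant, what I plan to try next (based on the thread above and the recon documentation, so please tell me if this is the wrong approach) is forcing a global GC to see whether these are refc binaries kept alive by processes that rarely collect, before experimenting with allocator flags such as +MBas aobf:

  %% heavy-handed, run off-peak: GC every process, then re-check usage
  [erlang:garbage_collect(P) || P <- erlang:processes()],
  recon_alloc:memory(usage).
  %% recon:bin_leak(10) should also show which processes give back the most
  %% binary memory after a GC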

Daniel Liu




