[erlang-questions] Garbage Collection, BEAM memory and Erlang memory

Roberto Ostinelli roberto@REDACTED
Wed Jan 28 12:47:56 CET 2015


Hi Fred,
Thank you for this.

I was collecting allocator and other data to go down this route. Then I
plotted all of this in a graph, and I'm actually seeing that the problem
does not seem to be fragmentation at all.

This is what I see:

erlang memory and recon allocated:
https://cldup.com/pbYr_YNfXw-3000x3000.png

allocator memory:
https://cldup.com/S9OaQZ84Zw-3000x3000.png

recon usage:
https://cldup.com/J9WpssSmt6-3000x3000.png

I've got my system under controlled load, so the incoming requests are all
the same at any given time.
I'm using the +Muacul 0 option Lukas suggested, so that's why the recon
usage never goes down to the 50% that I experienced before.

As you can see, it looks like my problem comes from eheap memory, allocated
for processes. Binary usage is now definitely stable.

The last allocator info that I got before the VM blew up was:


*recon_alloc:fragmentation(current)*

[{{eheap_alloc,1},
  [{sbcs_usage,0.9998089938211238},
   {mbcs_usage,0.8436328764998643},
   {sbcs_block_size,11515752},
   {sbcs_carriers_size,11517952},
   {mbcs_block_size,1398905280},
   {mbcs_carriers_size,1658191992}]},
 {{eheap_alloc,6},
  [{sbcs_usage,0.6174983671623794},
   {mbcs_usage,0.8458609176542206},
   {sbcs_block_size,3146416},
   {sbcs_carriers_size,5095424},
   {mbcs_block_size,1402599800},
   {mbcs_carriers_size,1658191992}]},
 {{eheap_alloc,7},
  [{sbcs_usage,0.9996959805427548},
   {mbcs_usage,0.8455184225713416},
   {sbcs_block_size,7997056},
   {sbcs_carriers_size,7999488},
   {mbcs_block_size,1409124600},
   {mbcs_carriers_size,1666580600}]},
 {{eheap_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8449706854053697},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1401788136},
   {mbcs_carriers_size,1658978424}]},
 {{eheap_alloc,3},
  [{sbcs_usage,0.997382155987395},
   {mbcs_usage,0.8450586945061064},
   {sbcs_block_size,972296},
   {sbcs_carriers_size,974848},
   {mbcs_block_size,1401269560},
   {mbcs_carriers_size,1658191992}]},
 {{eheap_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8460061373569331},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1409937416},
   {mbcs_carriers_size,1666580600}]},
 {{eheap_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8445225737274885},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1393296200},
   {mbcs_carriers_size,1649803384}]},
 {{eheap_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8437420231923147},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1384930624},
   {mbcs_carriers_size,1641414776}]},
 {{binary_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6716575320561889},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,99502200},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6781073538541975},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,100457704},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6796450983368588},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,100685512},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6840474562333329},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,101337696},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7055153695407212},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,98599752},
   {mbcs_carriers_size,139755640}]},
 {{binary_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7102128257578728},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,99256248},
   {mbcs_carriers_size,139755640}]},
 {{binary_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7147085012096829},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,99884544},
   {mbcs_carriers_size,139755640}]},
 {{binary_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7225102042050122},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,102300688},
   {mbcs_carriers_size,141590648}]},
 {{ll_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8925413646389566},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,211512896},
   {mbcs_carriers_size,236978256}]},
 {{driver_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8187761906238381},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25139488},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8188191821347083},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25140808},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8218319751055703},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25233312},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8219393236054401},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25236608},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8316551445074958},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25534920},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8330978353904555},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25579216},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8339579261624709},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25605624},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.844592202069481},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25932136},
   {mbcs_carriers_size,30703736}]},
 {{fix_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9433490069813972},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,68531264},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9573151050777532},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,69545856},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9603094292856162},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,69763384},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9669017658815307},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70242296},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9680059580345314},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70322512},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9682964595703463},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70343616},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9716501114929037},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70587248},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9716850201308314},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70589784},
   {mbcs_carriers_size,72646776}]},
 {{std_alloc,7},
  [{sbcs_usage,0.8571690150669643},
   {mbcs_usage,0.7343726386030993},
   {sbcs_block_size,1572912},
   {sbcs_carriers_size,1835008},
   {mbcs_block_size,2526800},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.669268417442658},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2302792},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6742580127646217},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2319960},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6778409421174392},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2332288},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6785849637870703},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2334848},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6792173822062567},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2337024},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.679821899812832},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2339104},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6809565328590195},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2343008},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6915890675315919},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2379592},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7155860914449134},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2462160},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7317499622176495},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2517776},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7326009370022902},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2520704},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7368139597065765},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2535200},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7383717550773666},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2540560},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7468582522465966},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2569760},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7473279159255514},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2571376},
   {mbcs_carriers_size,3440760}]},
 {{ets_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.643050651678168},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,864008},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6700704372108531},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,900312},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6724282677685753},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,903480},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6745300712707873},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,906304},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6789420723901615},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,912232},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6872540205178892},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,923400},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7930825062071676},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1065592},
   {mbcs_carriers_size,1343608}]},
 {{eheap_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.33664856509447394},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,132416},
   {mbcs_carriers_size,393336}]},
 {{ets_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8979523789676751},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1206496},
   {mbcs_carriers_size,1343608}]},
 {{temp_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,131192}]},
 {{std_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8726646601046666},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,257464},
   {mbcs_carriers_size,295032}]},
 {{sl_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.0},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,0},
   {mbcs_carriers_size,32888}]},
 {{fix_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.02456823157382632},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,808},
   {mbcs_carriers_size,32888}]},
 {{driver_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.039649720262709805},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1304},
   {mbcs_carriers_size,32888}]},
 {{binary_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.41303819022135735},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,13584},
   {mbcs_carriers_size,32888}]},
 {{ets_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.4582826562880078},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,15072},
   {mbcs_carriers_size,32888}]}]


and *recon_alloc:fragmentation(max)*

[{{eheap_alloc,1},
  [{sbcs_usage,0.9995836942360554},
   {mbcs_usage,0.8438346046481209},
   {sbcs_block_size,35747288},
   {sbcs_carriers_size,35762176},
   {mbcs_block_size,1399239784},
   {mbcs_carriers_size,1658191992}]},
 {{eheap_alloc,5},
  [{sbcs_usage,0.947959190616155},
   {mbcs_usage,0.8462006049992422},
   {sbcs_block_size,17833888},
   {sbcs_carriers_size,18812928},
   {mbcs_block_size,1410261512},
   {mbcs_carriers_size,1666580600}]},
 {{eheap_alloc,2},
  [{sbcs_usage,0.9996057750948681},
   {mbcs_usage,0.8451859214776624},
   {sbcs_block_size,22658328},
   {sbcs_carriers_size,22667264},
   {mbcs_block_size,1402145208},
   {mbcs_carriers_size,1658978424}]},
 {{eheap_alloc,7},
  [{sbcs_usage,0.9994766588528468},
   {mbcs_usage,0.8459256516006487},
   {sbcs_block_size,24231536},
   {sbcs_carriers_size,24244224},
   {mbcs_block_size,1409803280},
   {mbcs_carriers_size,1666580600}]},
 {{eheap_alloc,3},
  [{sbcs_usage,0.9996129234481453},
   {mbcs_usage,0.8454477688733163},
   {sbcs_block_size,27043608},
   {sbcs_carriers_size,27054080},
   {mbcs_block_size,1401914720},
   {mbcs_carriers_size,1658191992}]},
 {{eheap_alloc,8},
  [{sbcs_usage,0.9996057750948681},
   {mbcs_usage,0.8446861617056788},
   {sbcs_block_size,22658328},
   {sbcs_carriers_size,22667264},
   {mbcs_block_size,1393566088},
   {mbcs_carriers_size,1649803384}]},
 {{eheap_alloc,4},
  [{sbcs_usage,0.9996057750948681},
   {mbcs_usage,0.8443270209723029},
   {sbcs_block_size,22658328},
   {sbcs_carriers_size,22667264},
   {mbcs_block_size,1385890848},
   {mbcs_carriers_size,1641414776}]},
 {{eheap_alloc,6},
  [{sbcs_usage,0.999511325925181},
   {mbcs_usage,0.8462386302490357},
   {sbcs_block_size,25444200},
   {sbcs_carriers_size,25456640},
   {mbcs_block_size,1403226120},
   {mbcs_carriers_size,1658191992}]},
 {{ll_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9004818737462563},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,213394624},
   {mbcs_carriers_size,236978256}]},
 {{binary_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8944150703711425},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,132502448},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8954607538998072},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,132657360},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8966842101085153},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,132838608},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8991528310974315},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,133204320},
   {mbcs_carriers_size,148144248}]},
 {{binary_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9407874487212108},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,131480352},
   {mbcs_carriers_size,139755640}]},
 {{binary_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9438688842897504},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,131911000},
   {mbcs_carriers_size,139755640}]},
 {{binary_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9512322593509142},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,134685592},
   {mbcs_carriers_size,141590648}]},
 {{binary_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9556126965609403},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,133552264},
   {mbcs_carriers_size,139755640}]},
 {{driver_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8197313838289907},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25168816},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8225703868740925},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25255984},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.825497196823214},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25345848},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8260857896902188},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25363920},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8351752373066261},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25643000},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8361403315870095},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25672632},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8387523915656387},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25752832},
   {mbcs_carriers_size,30703736}]},
 {{driver_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8461701207957234},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,25980584},
   {mbcs_carriers_size,30703736}]},
 {{fix_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9439967439160686},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,68578320},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9576758643769684},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,69572064},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9603094292856162},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,69763384},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9676070965626885},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70293536},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9680071693752796},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70322600},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9685851991559818},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70364592},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9718468993035562},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70601544},
   {mbcs_carriers_size,72646776}]},
 {{fix_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.9719647297217979},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,70610104},
   {mbcs_carriers_size,72646776}]},
 {{std_alloc,4},
  [{sbcs_usage,0.7500457763671875},
   {mbcs_usage,0.7213592345877073},
   {sbcs_block_size,786480},
   {sbcs_carriers_size,1048576},
   {mbcs_block_size,2482024},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,7},
  [{sbcs_usage,0.8571690150669643},
   {mbcs_usage,0.7376068077982771},
   {sbcs_block_size,1572912},
   {sbcs_carriers_size,1835008},
   {mbcs_block_size,2537928},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,3},
  [{sbcs_usage,0.80003662109375},
   {mbcs_usage,0.7450307490205652},
   {sbcs_block_size,1048624},
   {sbcs_carriers_size,1310720},
   {mbcs_block_size,2563472},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,5},
  [{sbcs_usage,0.8333638509114584},
   {mbcs_usage,0.7537404526906846},
   {sbcs_block_size,1310768},
   {sbcs_carriers_size,1572864},
   {mbcs_block_size,2593440},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6812471663237192},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2344008},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6843627570652995},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2354728},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6919796789081482},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2380936},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7003545728269336},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2409752},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7007591346097956},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2411144},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7018333158953255},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2414840},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7091782048152152},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2440112},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,6},
  [{sbcs_usage,0.90003662109375},
   {mbcs_usage,0.7518385472976901},
   {sbcs_block_size,1179696},
   {sbcs_carriers_size,1310720},
   {mbcs_block_size,2586896},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7386042618491263},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2541360},
   {mbcs_carriers_size,3440760}]},
 {{sl_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7387228403027238},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2541768},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7394575617014846},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,2544296},
   {mbcs_carriers_size,3440760}]},
 {{std_alloc,2},
  [{sbcs_usage,0.9938373447204969},
   {mbcs_usage,0.7439193666515538},
   {sbcs_block_size,655392},
   {sbcs_carriers_size,659456},
   {mbcs_block_size,2559648},
   {mbcs_carriers_size,3440760}]},
 {{temp_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.2534939072766849},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,299064},
   {mbcs_carriers_size,1179768}]},
 {{binary_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.39198933022131455},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,526680},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6432173669701282},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,864232},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6702371525028133},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,900536},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6746967865627475},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,906528},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6770010300623396},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,909624},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6791087876821216},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,912456},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.6874207358098493},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,923624},
   {mbcs_carriers_size,1343608}]},
 {{ets_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.7939398991372484},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1066744},
   {mbcs_carriers_size,1343608}]},
 {{eheap_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.3466756157585372},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,136360},
   {mbcs_carriers_size,393336}]},
 {{sl_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.1492448276797093},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,44032},
   {mbcs_carriers_size,295032}]},
 {{driver_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.4211610943897611},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,124256},
   {mbcs_carriers_size,295032}]},
 {{ets_alloc,3},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.8981190942596352},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,1206720},
   {mbcs_carriers_size,1343608}]},
 {{temp_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.07579730471370205},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,9944},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,8},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.11823891700713458},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,15512},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,7},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.11823891700713458},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,15512},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,4},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.11823891700713458},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,15512},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,2},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.11823891700713458},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,15512},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,1},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.11823891700713458},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,15512},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,5},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.12494664308799316},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,16392},
   {mbcs_carriers_size,131192}]},
 {{temp_alloc,6},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.2140984206354046},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,28088},
   {mbcs_carriers_size,131192}]},
 {{fix_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.02627098029676478},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,864},
   {mbcs_carriers_size,32888}]},
 {{std_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.90007863553784},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,265552},
   {mbcs_carriers_size,295032}]},
 {{ets_alloc,0},
  [{sbcs_usage,1.0},
   {mbcs_usage,0.4582826562880078},
   {sbcs_block_size,0},
   {sbcs_carriers_size,0},
   {mbcs_block_size,15072},
   {mbcs_carriers_size,32888}]}]


Both of these look fine to me, unless I've missed a spot.
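
One rough cross-check on dumps like these is to add up the carrier sizes per
allocator type, since the per-instance numbers are easy to underestimate when
read one by one. The following is only a minimal sketch: the module name
carrier_totals and the functions sum_key/3 and sample/0 are made up for
illustration, and sample/0 is just the eheap_alloc entries from the
fragmentation(current) output above, trimmed to the two mbcs size fields.

```erlang
-module(carrier_totals).
-export([sum_key/3, sample/0]).

%% Sum one key (e.g. mbcs_carriers_size) over every instance of a given
%% allocator type, for a list shaped like the output of
%% recon_alloc:fragmentation/1. Missing keys count as 0.
sum_key(Alloc, Key, Frag) ->
    lists:sum([proplists:get_value(Key, Props, 0)
               || {{A, _Instance}, Props} <- Frag, A =:= Alloc]).

%% Hypothetical helper: eheap_alloc entries copied from the
%% fragmentation(current) dump, keeping only the mbcs size fields.
sample() ->
    [{{eheap_alloc,1}, [{mbcs_block_size,1398905280},
                        {mbcs_carriers_size,1658191992}]},
     {{eheap_alloc,6}, [{mbcs_block_size,1402599800},
                        {mbcs_carriers_size,1658191992}]},
     {{eheap_alloc,7}, [{mbcs_block_size,1409124600},
                        {mbcs_carriers_size,1666580600}]},
     {{eheap_alloc,2}, [{mbcs_block_size,1401788136},
                        {mbcs_carriers_size,1658978424}]},
     {{eheap_alloc,3}, [{mbcs_block_size,1401269560},
                        {mbcs_carriers_size,1658191992}]},
     {{eheap_alloc,5}, [{mbcs_block_size,1409937416},
                        {mbcs_carriers_size,1666580600}]},
     {{eheap_alloc,8}, [{mbcs_block_size,1393296200},
                        {mbcs_carriers_size,1649803384}]},
     {{eheap_alloc,4}, [{mbcs_block_size,1384930624},
                        {mbcs_carriers_size,1641414776}]},
     {{eheap_alloc,0}, [{mbcs_block_size,132416},
                        {mbcs_carriers_size,393336}]}].
```

Calling carrier_totals:sum_key(eheap_alloc, mbcs_carriers_size,
carrier_totals:sample()) on this trimmed sample gives 13258327096 bytes,
roughly 13.3 GB of multiblock carriers for eheap_alloc alone. On a live node
the same function should accept recon_alloc:fragmentation(current) directly
in place of the sample.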

What I do not understand is why, in a system under stable load, suddenly
the eheap blows up like that, eventually crashing the system (this box only
has 15GB of RAM).

Any suggestions on the steps I could take to debug this?

Best,
r.



On Tue, Jan 27, 2015 at 8:09 PM, Fred Hebert <mononcqc@REDACTED> wrote:

> On 01/27, Roberto Ostinelli wrote:
> >
> > I see consistent total, process and binary usage. Unfortunately the ratio
> > falls:
> >
> > 2> recon_alloc:memory(usage, current).
> > 0.7353255106318904
> >
> > ..after a while:
> >
> > 3> recon_alloc:memory(usage, current).
> > 0.5630988225908702
> >
>
> Yeah that does look like some good indication there's an allocator
> 'leak' (or bad usage). You can possibly look at other recon functions to
> try and figure out whether things are wrong in specific ways (a given
> allocator is worse than others -- if it's allocator 0, that's for NIFs
> and drivers -- or other ones).
>
> > Why is the VM so eager on memory if the underlying erlang usage is
> > stable?
> >
> > Is there anything I can do? I honestly don't know where else to look.
> >
> >    - Binaries are optimized (checked with +bin_opt_info).
> >    - Erlang reported memory for total, process and binary is linear.
> >    - I'm using some gimmicks like fullsweep_after 10 as a system flag.
> >    - I hibernate the long living TCP connections (which is where the
> >    problem comes from, since I ran tests on short lived connections and
> >    had no issues).
> >
> > Any help would be greatly appreciated.
> >
>
> What this looks like from the usage metrics is possibly the need for
> different memory allocation strategy. There's unfortunately no super
> easy way to do it, but if the problem shows up quickly, that at least
> makes it a lot easier to experiment.
>
> I have covered the topic in Section 7.3 of Erlang in Anger
> (http://erlang-in-anger.com), Memory Fragmentation.
>
> The steps are usually:
>
> 1. Find that you have fragmentation issues (done)
> 2. Find which allocator is to blame
> 3. Take note of what your usage pattern is for that allocator. Is data
>    held for a long time? Only some of it? Does it get released in large
>    blocks? What's the variation in datasize type? Max, min, p95 or p99
>    size?
> 4. Check the different strategies available (p.71-73) and see if one
>    could make sense for your usage.
> 5. Check your average block size (in recon_alloc) and see if you need to
>    tweak yours up or down so more or less data gets to be used in the
>    same operation (and may hold them back if they need to be released)
> 6. Experiment a lot and report back.
>
> If you tend to see lots of holes, doing things like reducing block sizes
> (while making sure your sbcs/mbcs ratio remains low enough) and looking
> for some address-order strategy (rather than best fit) might end up
> helping by reusing already-allocated blocks more, and also reducing how
> much spread there is.
>
> Anyway, that's more or less the extent of the experimenting I've done
> that can be applied in a generic manner.
>