[erlang-questions] VM leaking memory
Fred Hebert
mononcqc@REDACTED
Thu Jan 31 23:14:15 CET 2019
On 01/31, Frank Muller wrote:
>After adding a new feature to my app (running non-stop for 5 years), it
>started leaking memory in staging.
>
>Obviously, I’m suspecting this new feature. Command top shows RES going
>from 410m (during startup) to 6.2g in less than 12h.
>
>For stupid security reasons, it will take me weeks to be allowed to share
>collected statistics (from recon, entop) here, but I can share them in
>private if someone is willing to help.
>
I'd recommend checking things like:
- recon_alloc:memory(usage), to see whether the ratio of used to
  allocated memory is high or very low; a low ratio can point towards
  memory fragmentation.
- if there is fragmentation (or something that looks like it),
  recon_alloc:fragmentation(current) will return lists of all the
  various allocators and types, which should help point towards which
  type of memory is causing issues.
- if usage seems high, see recon_alloc:memory(allocated_types) to check
  whether any allocator is higher than the others; ETS, binary, or eheap
  will tend to point towards an ETS table, a refc binary leak, or some
  process gathering lots of memory, respectively.
Based on this it might be possible to then orient towards other avenues
without you having to share any numbers.
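As a rough sketch, the three checks above could be run from a shell attached to the affected node. This assumes recon is already in the node's code path; the variable names are only illustrative.

```erlang
%% Ratio of memory actually in use to memory allocated by the VM.
%% A low value suggests fragmentation; a high one suggests a real leak.
Usage = recon_alloc:memory(usage).

%% Per-allocator-instance figures, worth reading when Usage is low.
Frag = recon_alloc:fragmentation(current).

%% Memory allocated per allocator type, sorted largest first, to see
%% whether ets_alloc, binary_alloc, or eheap_alloc dominates.
ByType = lists:reverse(lists:keysort(2, recon_alloc:memory(allocated_types))).
```

Comparing ByType between two samples taken some time apart should show which allocator type is growing.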
A quick check for binary memory is to call recon:bin_leak(10), which
will probe all processes for their binary memory usage, run a GC on all
of them, then probe again, and give you the 10 with the largest gap.
This can point to the processes that were holding the most dead memory.
There's an undocumented 'binary_memory' option that recon:info,
recon:proc_count, and recon:proc_window all support -- it's undocumented
because it might be expensive and not always safe to run -- that you can
use to find which processes are holding the most binary memory; after a
call to bin_leak, this can let you know about biggest users.
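A minimal sketch of that sequence, assuming recon is loaded on the node. Note that bin_leak forces a garbage collection of every process, so it is not free on a busy system. The use of self() at the end is only a placeholder for a pid taken from the earlier results.

```erlang
%% Processes whose binary usage changed the most across a forced GC.
%% Each entry should be a {Pid, Delta, Info} tuple.
Leaky = recon:bin_leak(10).

%% After the GC pass, the 10 processes still holding the most refc
%% binary memory (the undocumented binary_memory attribute).
Holders = recon:proc_count(binary_memory, 10).

%% The same attribute for a single process; replace self() with a pid
%% of interest taken from Leaky or Holders.
recon:info(self(), binary_memory).
```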
You can also use proc_count with:
- message_queue_len for large mailboxes
- memory for eheap usage
You can use the same values with proc_window to see who is currently
allocating the most.
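For instance, something along these lines, again assuming recon is available. The count of 10 and the 5000 ms window are arbitrary values picked for illustration.

```erlang
%% 10 processes with the longest mailboxes right now.
recon:proc_count(message_queue_len, 10).

%% 10 processes using the most memory right now.
recon:proc_count(memory, 10).

%% 10 processes whose memory grew the most over a 5 second window.
recon:proc_window(memory, 10, 5000).
```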
If ETS is taking up a lot of space, calling ets:i() can show a bunch of
tables with content; you might have a runaway cache table or something
like that.
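If the ets:i() listing is too long to scan by eye, a sketch like the following ranks tables by memory using only stdlib calls. ets:info(Tab, memory) reports words, so it is multiplied by the word size to get bytes; the variable names are illustrative.

```erlang
WordSize = erlang:system_info(wordsize).

%% {Table, NumberOfObjects, Bytes} for every table that still exists;
%% the is_integer/1 filter skips tables deleted while we iterate.
Tables = [{Tab, ets:info(Tab, size), Words * WordSize}
          || Tab <- ets:all(),
             Words <- [ets:info(Tab, memory)],
             is_integer(Words)].

%% The 10 largest tables by memory, biggest first.
lists:sublist(lists:reverse(lists:keysort(3, Tables)), 10).
```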
Regards,
Fred.