[erlang-questions] Help debugging binary memory usage

Paul Oliver
Sun Oct 16 10:05:47 CEST 2016


Hey Luca,

Check out https://github.com/ferd/recon and
http://dieswaytoofast.blogspot.com/2012/12/erlang-binaries-and-garbage-collection.html
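
As a rough sketch of where to start (assuming recon is on the node's code
path and you can attach a remote shell to the affected node): binaries
larger than 64 bytes are reference-counted and live outside the process
heaps, so they are only freed once every process holding a reference has
been garbage collected. Long-lived processes that rarely GC can pin a lot
of them. Something like the following should show whether that is what is
happening; the count of 10 is arbitrary.

```erlang
%% Run these one at a time in a shell attached to the affected node.
%% Assumption: the recon application is available on that node.

%% Bytes currently allocated for reference-counted binaries.
erlang:memory(binary).

%% Force a garbage collection of every process and list the 10 processes
%% that dropped the most binary references as a result.
recon:bin_leak(10).

%% Read the total again; a large drop suggests processes were holding
%% references to binaries they no longer needed.
erlang:memory(binary).

%% The 10 processes that currently reference the most binary memory.
recon:proc_count(binary_memory, 10).
```

If the second reading is far lower than the first, the processes reported
by recon:bin_leak/1 are the ones to look at first.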

Cheers,
Paul.

On Sun, Oct 16, 2016 at 8:53 PM Luca Spiller wrote:

> Hi everyone,
>
> One of our nodes seems to have a memory leak. After a couple of days its
> memory usage gets so high that the OOM killer kills it, and it is restarted.
> This seems to have been going on for a few years; the node works fine the
> whole time, so nobody noticed. It just uses up all the memory on the box.
>
> A bit of background: the node makes hundreds of HTTP requests per second.
> There are a thousand or so worker processes responsible for this, each of
> which makes a request, inspects the response headers and, based on these,
> starts other processes. The worker then sleeps for X time (seconds to
> minutes) and does the same again. The response body can be any size, but
> the application doesn't care about it (though I'd assume lhttpc converts
> it to a binary). I should also note that some of the requests are made
> over TLS.
>
>
> https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-system.png
>
> This is the output from Observer. As you can see, it shows that binaries
> are using 2569 MB of RAM. When the node has been restarted and has been
> running for a few minutes, this is usually < 10 MB. Most of the worker
> processes (95%+) that make the requests are started shortly after the node
> starts and hang around forever.
>
>
> https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-processes.png
>
> This is the process list from Observer, sorted by memory. It doesn't
> appear to show anything interesting. The worker processes (XXX:init/1) use
> roughly the same amount of memory after they've been running for a few
> minutes.
>
> As I understand it, large binaries stick around until the system is under
> 'high memory pressure' before being GCed. In my case the node uses up half
> the swap and all the RAM. Is that not high enough? After that the OOM
> killer jumps in and deals with it forcibly.
>
> So... what can I do to debug this?
>
> Thanks,
>
> Luca Spiller