<div dir="ltr">Hey Luca,<div><br></div><div>Check out <a href="https://github.com/ferd/recon">https://github.com/ferd/recon</a> and <a href="http://dieswaytoofast.blogspot.co.nz/2012/12/erlang-binaries-and-garbage-collection.html">http://dieswaytoofast.blogspot.com/2012/12/erlang-binaries-and-garbage-collection.html</a></div><div><br></div><div>Cheers,</div><div>Paul.</div></div><br><div class="gmail_quote"><div dir="ltr">On Sun, Oct 16, 2016 at 8:53 PM Luca Spiller &lt;<a href="mailto:luca@stackednotion.com">luca@stackednotion.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">Hi everyone,<div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">One of our nodes seems to have a memory leak. After a couple of days the memory usage gets so high that the OOM killer kills it, and it's restarted. This seems to have been going on for a few years: the node works fine the whole time, so nobody noticed. It just uses up all the memory on the box.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">A bit of background: the node is making hundreds of HTTP requests per second. There are a thousand or so worker processes responsible for this, each of which makes a request, inspects the response headers, and based on these starts other processes. The worker then sleeps for X time (seconds to minutes) and does the same again. The response body can be any size, and the application doesn't care about it (though I'd assume lhttpc converts it to a binary). 
I should also note that some of the requests are made over TLS.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><a href="https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-system.png" class="gmail_msg" target="_blank">https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-system.png</a><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">This is the output from Observer. As you can see, binaries are using 2569 MB of RAM. When the node has been restarted and has been running for a few minutes, this is usually &lt; 10 MB. Most of the worker processes (95%+) that make the requests are started shortly after the node starts and hang around forever.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><a href="https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-processes.png" class="gmail_msg" target="_blank">https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-processes.png</a><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">This is the process list from Observer, sorted by memory. It doesn't appear to show anything interesting. The worker processes (XXX:init/1) use roughly the same amount of memory after they've been running for a few minutes.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">As I understand it, large binaries stick around until the system is under 'high memory pressure' before being GCed. In my case the node uses up half the swap and all the RAM. Is that not high enough? After that the OOM killer jumps in and deals with it forcibly.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">So... 
what can I do to debug this?<br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Thanks,</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Luca Spiller</div></div>
_______________________________________________<br class="gmail_msg">
erlang-questions mailing list<br class="gmail_msg">
<a href="mailto:erlang-questions@erlang.org" class="gmail_msg" target="_blank">erlang-questions@erlang.org</a><br class="gmail_msg">
<a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" class="gmail_msg" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br class="gmail_msg">
</blockquote></div>
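<div dir="ltr"><div>A minimal sketch of how recon could be applied to the situation described above, run from a remote shell attached to the affected node. It assumes recon is already on the node's code path; the limit of 10 is an arbitrary value chosen for illustration.</div><div><br></div>
<pre>
%% Total memory currently used by binaries on this node, in bytes.
erlang:memory(binary).

%% Top 10 processes ranked by the memory of the refc binaries they reference.
recon:proc_count(binary_memory, 10).

%% Garbage-collect every process and list the 10 that released the most
%% binary references as a result. This forces a GC on the whole node.
recon:bin_leak(10).

%% Check binary memory again to see how much the forced GC gave back.
erlang:memory(binary).
</pre>
<div>If the second erlang:memory(binary) reading is much lower than the first, processes were probably holding references to binaries they no longer needed, and the recon:bin_leak/1 output shows which ones. If the long-lived workers top that list, hibernating them between requests or spawning them with a low fullsweep_after value are the usual things to try, though whether either helps here is something to verify on the node itself.</div></div>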