<html xmlns="http://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office"><head><!--[if gte mso 9]><xml><o:OfficeDocumentSettings><o:AllowPNG/><o:PixelsPerInch>96</o:PixelsPerInch></o:OfficeDocumentSettings></xml><![endif]--></head><body><div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px"><div id="yui_3_16_0_ym19_1_1476216801148_363020" dir="ltr"><span id="yui_3_16_0_ym19_1_1476216801148_363019">Seems you have been bitten by a topic I recently discussed.  This is a common pitfall with Erlang's share-nothing heap.  Look at the recently added command line args in <a href="http://erlang.org/doc/man/erl.html">http://erlang.org/doc/man/erl.html</a>, specifically +hmax.<br><br>For more details check out the readme at <a href="https://github.com/vans163/stargate" id="yui_3_16_0_ym19_1_1476216801148_363247">https://github.com/vans163/stargate</a>, specifically the Websocket Example section.<br><br>TL;DR: The most likely reason is that you have a long-lived process that is processing large binaries; large binaries fragment the shared binary heap beyond GC cleanup. The only solution is to kill the long-lived process from time to time.</span></div> <div class="qtdSeparateBR"><br><br></div><div class="yahoo_quoted" style="display: block;"> <div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif; font-size: 16px;"> <div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif; font-size: 16px;"> <div dir="ltr"><font size="2" face="Arial"> On Sunday, October 16, 2016 4:13 PM, Michael Martin &lt;mmartin4242@gmail.com&gt; wrote:<br></font></div>  <br><br> <div class="y_msg_container"><div id="yiv6730910856"><div>
    <div>Possible message leak? Check for unhandled messages, and log
      them. See the section on unhandled messages <a rel="nofollow" shape="rect" target="_blank" href="https://www.safaribooksonline.com/library/view/designing-for-scalability/9781449361556/ch04.html">here</a>.<br clear="none">
    </div>
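    <div>As a rough illustration of the "log unhandled messages" advice, here is a minimal sketch of a gen_server with a catch-all handle_info/2 clause. The module name leak_probe_worker, the unexpected_count call and the counter in the state are all made up for this example; they are not taken from the application discussed in this thread, and a real worker would also have its own clauses for the messages it does expect.</div>
    <pre>
```erlang
-module(leak_probe_worker).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Hypothetical worker, for illustration only.
start_link() ->
    gen_server:start_link(?MODULE, [], []).

init([]) ->
    %% Count unexpected messages so they can be inspected later.
    {ok, #{unexpected => 0}}.

handle_call(unexpected_count, _From, State = #{unexpected := N}) ->
    {reply, N, State};
handle_call(_Request, _From, State) ->
    {reply, {error, unknown_call}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Catch-all clause: anything not matched by an earlier clause is
%% logged and counted instead of being silently dropped.
handle_info(Msg, State = #{unexpected := N}) ->
    error_logger:warning_msg("~p: unexpected message ~p~n", [?MODULE, Msg]),
    {noreply, State#{unexpected := N + 1}}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```
    </pre>
    <div>Sending the process a stray message (Pid ! stray) and then calling gen_server:call(Pid, unexpected_count) should show the counter going up, alongside a warning in the log. For plain processes that use selective receive, the equivalent would be a final catch-all pattern in the receive, so that unmatched messages do not pile up in the mailbox.</div>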
    <br clear="none">
    <div class="yiv6730910856yqt5025129148" id="yiv6730910856yqtfd18572"><div class="yiv6730910856moz-cite-prefix">On 10/16/2016 03:05 AM, Paul Oliver
      wrote:<br clear="none">
    </div>
    <blockquote type="cite">
      <div dir="ltr">Hey Luca,
        <div><br clear="none">
        </div>
        <div>Check out <a rel="nofollow" shape="rect" target="_blank" href="https://github.com/ferd/recon">https://github.com/ferd/recon</a> and <a rel="nofollow" shape="rect" target="_blank" href="http://dieswaytoofast.blogspot.co.nz/2012/12/erlang-binaries-and-garbage-collection.html">http://dieswaytoofast.blogspot.com/2012/12/erlang-binaries-and-garbage-collection.html</a></div>
        <div><br clear="none">
        </div>
        <div>Cheers,</div>
        <div>Paul.</div>
      </div>
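      <div>A quick way to check whether the binaries are only being kept alive by processes that rarely garbage collect is to force a GC everywhere and compare the binary memory before and after. The sketch below uses only erlang:memory/1 and erlang:garbage_collect/1; the module name bin_gc_probe and the map keys are invented for this example. recon's recon:bin_leak/1 does a more useful per-process version of the same idea.</div>
      <pre>
```erlang
-module(bin_gc_probe).
-export([run/0]).

%% Force a garbage collection in every process on the node and
%% report the binary memory (in bytes) before and after.
%% Note: this can be expensive on a node with many processes.
run() ->
    Before = erlang:memory(binary),
    lists:foreach(fun(Pid) -> erlang:garbage_collect(Pid) end,
                  erlang:processes()),
    After = erlang:memory(binary),
    #{before_bytes => Before,
      after_bytes => After,
      freed_bytes => Before - After}.
```
      </pre>
      <div>If bin_gc_probe:run() reports a large freed_bytes value on the affected node, that would point towards idle processes holding references to binaries rather than a true leak. If almost nothing is freed, something is probably still holding on to the data (process state, ETS, or message queues).</div>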
      <br clear="none">
      <div class="yiv6730910856gmail_quote">
        <div dir="ltr">On Sun, Oct 16, 2016 at 8:53 PM Luca Spiller &lt;<a rel="nofollow" shape="rect" ymailto="mailto:luca@stackednotion.com" target="_blank" href="mailto:luca@stackednotion.com">luca@stackednotion.com</a>&gt;
          wrote:<br clear="none">
        </div>
        <blockquote class="yiv6730910856gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
          <div class="yiv6730910856gmail_msg" dir="ltr">Hi everyone,
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">One of our nodes seems to have a
              memory leak. After a couple of days the memory usage gets
              so high that the OOM killer kills it, and it's restarted.
              This seems to have been going on for a few years; the node
              otherwise works fine the whole time, so nobody noticed that
              it just uses up all the memory on the box.</div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">A bit of background: the node is
              making hundreds of HTTP requests per second. There are a
              thousand or so worker processes responsible for this. Each
              one makes a request, inspects the response headers, and
              based on these starts other processes. The worker then
              sleeps for X time (seconds to minutes) and does the same
              again. The response body can be any size, but we don't
              care about it in the application (though I'd assume it gets
              converted to a binary by lhttpc). I should also note that
              some of the requests are made over TLS.</div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg"><a rel="nofollow" shape="rect" class="yiv6730910856gmail_msg" target="_blank" href="https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-system.png">https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-system.png</a><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">This is the output from Observer. As
              you can see, it shows that binaries are using 2569 MB of
              RAM. When the node has been restarted and running for a
              few minutes this is usually &lt; 10 MB. Most of the worker
              processes (95%+) which make the requests are started
              shortly after the node starts and hang around forever.</div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg"><a rel="nofollow" shape="rect" class="yiv6730910856gmail_msg" target="_blank" href="https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-processes.png">https://dl.dropboxusercontent.com/u/21557257/20161016-erl/observer-processes.png</a><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">This is the process list from
              Observer, sorted by memory. It doesn't appear to show
              anything interesting. The worker processes (XXX:init/1)
              all use roughly the same amount of memory once they've been
              running for a few minutes.</div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">As I understand it, large binaries stick
              around until the system is under 'high memory pressure'
              before being GCed. In my case the node uses up half the
              swap and all the RAM. Is that not high enough? After
              that the OOM killer jumps in and deals with it forcibly.</div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">So... what can I do to debug this?<br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">Thanks,</div>
            <div class="yiv6730910856gmail_msg"><br clear="none" class="yiv6730910856gmail_msg">
            </div>
            <div class="yiv6730910856gmail_msg">Luca Spiller</div>
          </div>
          _______________________________________________<br clear="none" class="yiv6730910856gmail_msg">
          erlang-questions mailing list<br clear="none" class="yiv6730910856gmail_msg">
          <a rel="nofollow" shape="rect" class="yiv6730910856gmail_msg" ymailto="mailto:erlang-questions@erlang.org" target="_blank" href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a><br clear="none" class="yiv6730910856gmail_msg">
          <a rel="nofollow" shape="rect" class="yiv6730910856gmail_msg" target="_blank" href="http://erlang.org/mailman/listinfo/erlang-questions">http://erlang.org/mailman/listinfo/erlang-questions</a><br clear="none" class="yiv6730910856gmail_msg">
        </blockquote>
      </div>
      <br clear="none">
    </blockquote>
    <br clear="none">
  </div></div></div><br><br><br></div>  </div> </div>  </div></div></body></html>