[erlang-questions] Garbage Collection, BEAM memory and Erlang memory

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Fri Jan 23 15:26:36 CET 2015


On Fri, Jan 23, 2015 at 3:09 PM, Roberto Ostinelli <roberto@REDACTED>
wrote:

> I'm not storing binaries in a ETS. Also, I don't see anywhere I'm holding
> binary values for too long.


It is only a hunch. Clearly, something is using more memory than you
expect, and I assume you have made calculations which show that this
memory usage is excessive given the number of processes and the specific
profile of the system. Running 20000 SSL sockets will have some overhead
due to SSL, so it is important to figure out if the memory usage is normal
or unexpected in the first place.
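
As a rough starting point for that comparison, something along these lines
can be compiled and called from the shell. It is only a sketch: the module
name mem_report and its function names are made up for illustration, while
erlang:memory/1 and erlang:process_info/2 are the standard BIFs.

```erlang
-module(mem_report).
-export([summary/0, top/1]).

%% Coarse breakdown of what the VM itself thinks it has allocated, in bytes.
summary() ->
    [{K, erlang:memory(K)} || K <- [total, processes, binary, ets, system]].

%% The N processes with the largest memory footprint, biggest first.
%% Processes that die while we iterate return 'undefined' and are skipped
%% by the pattern in the second generator.
top(N) ->
    Sizes = [{Mem, Pid} || Pid <- erlang:processes(),
                           {memory, Mem} <- [erlang:process_info(Pid, memory)]],
    lists:sublist(lists:reverse(lists:sort(Sizes)), N).
```

Comparing mem_report:summary() against what the OS reports for the beam
process, and mem_report:top(10) against the per-connection cost you expect,
should tell you which side of the fence the surprise is on.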

Also, check the operating system. If you are out of memory, and being
killed by the OS, then there should be a log line in the kernel's log about
it. It is highly suspicious you get killed suddenly with no message at all.
If Erlang terminates, you can arrange that it crash-dumps so you can go
inspect the system afterwards. Alternatively, you can force it to terminate
with a SIGUSR1 signal which will have it crash-dump. Luckily, you are in
development mode, so being mean to the system is an option :)

Other typical OS limitations are resources such as file descriptors. You
need to verify that you are not being killed by the OS and debug from the
bottom-up. Otherwise you will end up making a wrong conclusion somewhere
along the way. Your goal should be to verify that what you *expect* is
*reality*. So seek those conclusions.

Another common thing I do is to run memsup (from os_mon) on the machine and
add an alert for processes which take up more than 5% of the memory of the
system. If such a process occurs, you go inspect it with process_info and
look at its backtrace. If the backtrace is rather small, you log it fully;
if it is too long, you log only the topmost parts of the call chain. All of
this is possible by installing a custom alarm_handler.
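
A minimal sketch of such a handler could look like the following. The
module name big_proc_alarm, the 4096-byte cut-off and the log format are
assumptions for illustration. It relies on memsup raising a
{process_memory_high_watermark, Pid} alarm, and on sasl and os_mon being
started.

```erlang
-module(big_proc_alarm).
-behaviour(gen_event).

-export([install/0]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

-define(MAX_BT_BYTES, 4096).  % arbitrary cut-off for long backtraces

%% Replace the default handler in the alarm_handler event manager.
install() ->
    gen_event:swap_handler(alarm_handler,
                           {alarm_handler, swap},
                           {?MODULE, []}).

%% On a swap, the old handler hands over the alarms it already knew about.
init({_Args, {alarm_handler, Alarms}}) -> {ok, Alarms};
init(_) -> {ok, []}.

handle_event({set_alarm, {process_memory_high_watermark, Pid} = Alarm}, Alarms) ->
    report(Pid),
    {ok, [Alarm | Alarms]};
handle_event({set_alarm, Alarm}, Alarms) ->
    {ok, [Alarm | Alarms]};
handle_event({clear_alarm, Id}, Alarms) ->
    {ok, lists:keydelete(Id, 1, Alarms)};
handle_event(_Other, Alarms) ->
    {ok, Alarms}.

handle_call(_Request, Alarms) -> {ok, Alarms, Alarms}.
handle_info(_Msg, Alarms) -> {ok, Alarms}.
terminate(_Reason, _Alarms) -> ok.
code_change(_OldVsn, Alarms, _Extra) -> {ok, Alarms}.

%% Log memory use and the backtrace; the process may already be gone.
report(Pid) ->
    case erlang:process_info(Pid, [memory, backtrace]) of
        undefined ->
            ok;
        [{memory, Mem}, {backtrace, Bt}] ->
            error_logger:warning_msg("~p uses ~p bytes~n~s~n",
                                     [Pid, Mem, truncate(Bt, ?MAX_BT_BYTES)])
    end.

%% Short backtraces are logged fully, long ones only from the top.
truncate(Bin, Max) when byte_size(Bin) =< Max -> Bin;
truncate(Bin, Max) -> binary:part(Bin, 0, Max).
```

After application:ensure_all_started(os_mon), calling
big_proc_alarm:install() should be enough to try it out. The threshold
itself is set on the memsup side, for instance with
memsup:set_procmem_high_watermark(0.05).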

It is yet another of those things I like to have in production systems,
because it quickly finds and reports odd processes that are doing nasty
things to the system.


-- 
J.