[erlang-questions] Memory usage increase in OTP-18.2.1

Fred Hebert mononcqc@REDACTED
Fri May 20 15:32:46 CEST 2016


On 05/20, Park, Sungjin wrote:
>I found that the memory usage's increased by about 20% when I upgraded
>erlang version from otp-r15b03 to otp-18.2.1.  The green line's otp-18.2.1
>and the blue line's otp-r15b03 in the attached graph.  The vertical axis is
>residential memory used by beam in percentage. The green line starts a bit
>higher than the blue.  This is ok but the gap increases as time goes by.
>

Well, that's gonna be hard to pinpoint. What you're doing is going over 
3 years of development and seeing that things changed. A lot happened in 
that time, and without knowing what your reverse proxy does or which 
libraries it uses, it will be quite tricky to just say "oh yeah, for that 
pattern, 18 months ago, patch X caused this to bloat". I'm sure you can 
understand how tricky this is.

First, let's look at the only piece of data we've got, the memory 
figures:

>[{total,           208909800},
>[{total,           250735008},
> {processes,        69473238},
> {processes,       123869144},
> {processes_used,   69388990},
> {processes_used,  123804656},
> {binary,            1066312},
> {binary,            6331096},
> {code,             22978624},
> {code,             29690005},
>

So it looks like process memory went up, so did binary memory, and so 
did code memory. The most significant bump seems to come from processes 
though. That's as far as diagnostics can go with these figures alone.
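
To put rough numbers on that comparison, here is a minimal sketch that 
diffs two erlang:memory()-style lists. The module name mem_diff and its 
functions are made up for illustration. I'm also assuming the first 
figure of each pair above is the R15B03 node and the second is the 
18.2.1 node; the quoted output doesn't label them.

```erlang
-module(mem_diff).
-export([diff/2, sample/0]).

%% Hypothetical helper: compare two erlang:memory()-style proplists.
%% Returns {Key, DeltaBytes, PercentChange} for every key found in both.
diff(Old, New) ->
    [{Key, NewV - OldV, percent(OldV, NewV)}
     || {Key, OldV} <- Old,
        %% keys missing from New yield 'false' and are skipped here
        {_, NewV} <- [lists:keyfind(Key, 1, New)]].

%% Percent change rounded to one decimal; undefined when Old is zero.
percent(0, _New) -> undefined;
percent(Old, New) -> round((New - Old) * 1000 / Old) / 10.

%% The figures quoted above. ASSUMPTION: first of each pair is the
%% R15B03 node, second is the 18.2.1 node.
sample() ->
    Old = [{total, 208909800}, {processes, 69473238},
           {processes_used, 69388990}, {binary, 1066312},
           {code, 22978624}],
    New = [{total, 250735008}, {processes, 123869144},
           {processes_used, 123804656}, {binary, 6331096},
           {code, 29690005}],
    diff(Old, New).
```

Calling mem_diff:sample() in a shell gives the per-category deltas. On a 
live system you'd feed it erlang:memory() output captured from each node 
at comparable uptimes instead.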

>The system acts as a reverse proxy that receives http requests and forwards
>them to multiple backends.  Nothing very special.
>

There's a lot that can go on in a reverse proxy.

>Well, it's still affordable but I want to know that this is the normal
>price to pay to keep track of the otp pace.  Or can there possibly be
>anything wrong in my code that I have to investigate further?
>

Investigate further. There are so many ways to go forward with this:

- try releases one at a time until you find which one exhibits the most 
  drastic rise and look at the changelog
- run manual garbage collections over the node and see if that fixes it; 
  look in changelogs for changes to how GC works
- did you need to update any libraries or dependencies when moving from 
  R15 to 18? Check what changed there
- treat the issue as any other memory issue, ignoring the version change, 
  and see if you can find anything
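
For the manual garbage collection check and the generic "where is the 
memory going" question, something like the following could be a starting 
point. It's only a sketch: the module name gc_sweep and its functions 
are invented here, and it relies on nothing beyond erlang:memory/0, 
erlang:garbage_collect/1 and erlang:process_info/2. Forcing a GC on 
every process is heavy-handed, so treat it as a diagnostic rather than a 
fix.

```erlang
-module(gc_sweep).
-export([run/0, top_by_memory/1]).

%% Force a garbage collection on every process, then report how many
%% bytes each erlang:memory/0 category shrank by (negative = it grew).
run() ->
    Before = erlang:memory(),
    %% garbage_collect/1 returns false for processes that already died
    _ = [erlang:garbage_collect(Pid) || Pid <- erlang:processes()],
    After = erlang:memory(),
    [{Key, B - proplists:get_value(Key, After, B)} || {Key, B} <- Before].

%% Return the N processes using the most memory as {Pid, Bytes},
%% largest first. Processes that exit mid-scan are skipped.
top_by_memory(N) ->
    Sizes = [{Pid, Bytes}
             || Pid <- erlang:processes(),
                {memory, Bytes} <- [erlang:process_info(Pid, memory)]],
    lists:sublist(lists:reverse(lists:keysort(2, Sizes)), N).
```

If gc_sweep:run() frees a large chunk of process or binary memory, look 
at GC-related changes in the changelogs; if it barely moves, 
gc_sweep:top_by_memory(10) at least tells you which processes to look at 
first.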

Most of the tips I have, I have written down in www.erlang-in-anger.com.  
Hopefully there's stuff in there that can help.

The cost of moving forward in Erlang is, in my experience, rather cheap 
compared to many platforms and languages, except when you're one of 
those people hitting a corner case hard (see: Basho with scheduler 
collapse for a great example). It is certainly going to be cheaper when 
you amortize it over many releases rather than introducing 3 years of 
change at once.

Regards,
Fred.


