[erlang-questions] Erlang Memory Question
Erik Søe Sørensen
Sun Oct 5 23:37:44 CEST 2014
For the short term, I think the option of hibernating the processes should
be mentioned as well - it ensures that dormant session processes don't take
up more memory than necessary.
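
A minimal sketch of what hibernation could look like for a session process, assuming the sessions are gen_server processes. The module name session_sketch, the handle_xml/2 function and the state layout are made up for illustration and are not taken from the original poster's system.

```erlang
-module(session_sketch).
-behaviour(gen_server).

-export([start_link/0, handle_xml/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Hypothetical API: hand one XML document (as a binary) to the session.
handle_xml(Pid, XmlBin) when is_binary(XmlBin) ->
    gen_server:call(Pid, {xml, XmlBin}).

init([]) ->
    {ok, #{handled => 0}}.

handle_call({xml, XmlBin}, _From, State = #{handled := N}) ->
    %% Stand-in for the real parsing work. The large temporary list
    %% is garbage as soon as this clause returns.
    Len = length(binary_to_list(XmlBin)),
    %% Returning 'hibernate' makes gen_server hibernate the process
    %% (via proc_lib:hibernate/3) while it waits for the next message:
    %% the heap is garbage collected and shrunk to fit the live data.
    {reply, {ok, Len}, State#{handled := N + 1}, hibernate}.

handle_cast(_Msg, State) ->
    {noreply, State, hibernate}.

handle_info(_Info, State) ->
    {noreply, State, hibernate}.
```

Hibernating after every message costs a full garbage collection each time, so it fits sessions that stay idle for long stretches better than busy ones.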
On 05/10/2014 at 23.18, "Jesper Louis Andersen" wrote:
> On Sun, Oct 5, 2014 at 9:27 AM, Eranga Udesh <eranga.erl@REDACTED> wrote:
>> I found that temporary data, e.g. the result of binary_to_list on XML data
>> of say 100 KB in size (xmerl needs a string), won't get freed for a long
>> period of time without forced garbage collection. Therefore, when there are
>> about 500 user sessions, each process holding large memory blocks, the
>> system's memory usage becomes extremely high. We plan to support a large
>> number of user sessions, say tens of thousands, and this memory consumption
>> is a show stopper for us at the moment.
> This is your problem in a nutshell. Calling binary_to_list/1 on a 100KB
> binary blows it up to at least 2.4 megabytes in size. When the process is
> done, it takes a bit of time for the heap to shrink down again. This becomes
> a serious problem when your system has to process XML documents for a large
> set of users at the same time. You have two general options, which should
> both be applied in a serious system:
> * xmerl is only useful for small configuration blocks of data. If you are
> processing larger amounts of data, you need an XML parser which operates
> directly on the binary representation. In addition, it will help a lot if
> you can find an XML parser which allows you to parse in SAX-style, so you
> don't have to form an intermediate structure. In Haskell, particularly GHC,
> fusion optimizations would mostly take care of these things, but they don't
> exist in the Erlang ecosystem, so you will have to approach it yourself.
> Unfortunately I don't have any suggestion handy, since it has been too long
> since I last worked with XML as a format.
> * Your Erlang node() needs to have a way to shed load once it reaches
> capacity. In other words, you design your system up to a certain amount of
> simultaneous users and then you make sure there is a limit to how much
> processing can happen concurrently. This bounds the Erlang system so
> it does not break down under the stress if it gets loaded over capacity.
> Fred Hebert has written a book, "Erlang in Anger", which touches on the
> subject in chapter 3 - "Planning for overload". You may have 20,000 users
> on the system, but if you make sure only 100 of those can process XML data
> at the same time, you can have at most 240 megabytes of outstanding memory
> at any given moment. Also, you may want to think about how much time it
> will take K cores to chew through 240 megabytes of data. Reading data is
> Irina Guberman (from Ubiquiti Networks if memory serves) recently gave a
> very insightful (and funny!) talk on how she employed the "jobs"
> framework in a situation which is slightly akin to yours. It is highly
> recommended, since she touches on the subject in far more depth than I
> do here. For a production system I would recommend employing some kind of
> queueing framework early on. Otherwise, your system will just buckle under
> the load once it gets deployed.
>  http://www.erlang-in-anger.com/
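
The concurrency limit described in the quoted message could be sketched roughly as follows. This is not the "jobs" framework and not anything from the original thread, only a hand-rolled counting limiter to show the idea. The module name xml_limiter_sketch, the with_slot/2 function and the {error, overloaded} reply are assumptions made for illustration.

```erlang
-module(xml_limiter_sketch).
-behaviour(gen_server).

-export([start_link/1, with_slot/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Max is the number of callers allowed to do heavy work at once.
start_link(Max) when is_integer(Max), Max > 0 ->
    gen_server:start_link(?MODULE, Max, []).

%% Run Fun only if a slot is free. If all slots are taken the request
%% is shed immediately instead of being queued.
with_slot(Limiter, Fun) when is_function(Fun, 0) ->
    case gen_server:call(Limiter, acquire) of
        {ok, Ref} ->
            try
                {ok, Fun()}
            after
                gen_server:call(Limiter, {release, Ref})
            end;
        {error, overloaded} = Err ->
            Err
    end.

init(Max) ->
    {ok, #{max => Max, holders => #{}}}.

handle_call(acquire, {Pid, _Tag}, State = #{max := Max, holders := H}) ->
    case maps:size(H) < Max of
        true ->
            %% Monitor the holder so a crashed worker frees its slot.
            Ref = erlang:monitor(process, Pid),
            {reply, {ok, Ref}, State#{holders := H#{Ref => Pid}}};
        false ->
            {reply, {error, overloaded}, State}
    end;
handle_call({release, Ref}, _From, State = #{holders := H}) ->
    erlang:demonitor(Ref, [flush]),
    {reply, ok, State#{holders := maps:remove(Ref, H)}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({'DOWN', Ref, process, _Pid, _Reason}, State = #{holders := H}) ->
    {noreply, State#{holders := maps:remove(Ref, H)}};
handle_info(_Other, State) ->
    {noreply, State}.
```

A session would wrap its XML work as xml_limiter_sketch:with_slot(Limiter, fun() -> parse(Xml) end), where parse/1 stands for whatever parser is in use, and decide for itself how to react to {error, overloaded} (retry later, report busy to the client). With a limit of 100 and the 2.4 megabyte estimate from the quoted message, the outstanding parse memory stays near the 240 megabytes mentioned there. A framework such as "jobs" adds queueing and rate control on top of this basic idea.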