Megaco simple Media Gateway bleeds memory under load.
Hakan Mattsson
hakan@REDACTED
Tue Aug 27 17:30:29 CEST 2002
On Tue, 27 Aug 2002, Peter-Henry Mander wrote:
Pete> In megaco_messenger.erl the receive_message/4 function spawns a process
Pete> for each received message. The problem I have with this scheme is that
Pete> memory is being consumed by processes spawned in receive_message/4 and
Pete> garbage-collected at a crippling rate, leading to a bottleneck in the
Pete> Media Gateway.
Pete>
Pete> The MG Controller and MG run on separate machines. The MGC is only
Pete> consuming 50%-60% CPU and has a small stable memory footprint while
Pete> issuing over 300 add-modify-subtract request cycles each second, whereas
Pete> the MG is struggling at 99% and has a huge and ever expanding memory
Pete> footprint.
Pete>
Pete> I managed to streamline the MGC by reusing processes instead of spawning
Pete> new ones. This has made it efficient enough to potentially achieve over
Pete> 500 call cycles a second, and I wonder whether it would be possible
Pete> to use a similar scheme in receive_message/4, with a pool of
Pete> "process received message" processes instead of continually spawning
Pete> new ones?
Pete>
Pete> Are there any issues I must be aware of before I start "hacking"
Pete> megaco_messenger.erl? Is there a better way than my (possibly naive)
Pete> proposal?
There are two drawbacks to this approach:

- GC. The system would need to perform more garbage collection
  than the current solution, where the heaps of short-lived
  processes can be discarded cheaply instead of repeatedly
  GC'ing long-lived pool processes. The initial size of the
  spawned processes can be regulated with options to
  megaco_messenger:receive_message/4 (see the sketch after this
  list).

- Unsafe congestion handling. The pool solution does not really
  cope with the case where the MGC is able to outperform the MG.
  You may possibly be able to raise the current limit, but the
  memory of the MG would eventually be exhausted if the MGC
  persists.
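To illustrate the heap-size point: a short-lived handler can be
given a pre-sized heap so it never needs a GC pass before it
dies. This is a minimal sketch of the underlying Erlang
mechanism only; process_message/1 and the heap size are made up,
and the real knob is the receive_message option mentioned above.

    %% Minimal sketch, not the megaco API: spawn one handler per
    %% message with a heap large enough to decode and process it
    %% without garbage collecting. When the process dies, its
    %% whole heap is reclaimed in one cheap operation.
    handle_message(Bin) when is_binary(Bin) ->
        spawn_opt(fun() -> process_message(Bin) end,
                  [{min_heap_size, 4000}]).  % size in words, example value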
I would try to push the congestion problem down into the
transport layer. If you explicitly block the socket (with
megaco_tcp:block/1 et al.), the sender will back off and the
receiver will not hog more resources until you unblock the
socket again.
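For example, assuming Socket is the handle you got back when
setting up the transport, and drain_backlog/0 is a hypothetical
stand-in for letting the MG catch up:

    %% Back-pressure sketch: while blocked, the receiver stops
    %% reading, the TCP window fills, and the MGC's sender stalls,
    %% so no new work piles up in the MG.
    throttle(Socket) ->
        ok = megaco_tcp:block(Socket),
        drain_backlog(),                 % hypothetical: let the MG catch up
        ok = megaco_tcp:unblock(Socket).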
It should be possible to keep track of internal resources such
as memory and the number of currently handled requests
(megaco:system_info(n_active_requests)), and to use that
information to control the socket.
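A rough sketch of such a watchdog; the thresholds and the poll
interval are made-up example values:

    %% Flow-control sketch: block the socket when the MG is loaded,
    %% unblock it once the MG has recovered.
    watchdog(Socket) ->
        watchdog(Socket, unblocked).

    watchdog(Socket, State) ->
        Loaded = megaco:system_info(n_active_requests) > 1000
                 orelse erlang:memory(total) > 512*1024*1024,
        NewState =
            case {State, Loaded} of
                {unblocked, true}  -> ok = megaco_tcp:block(Socket),
                                      blocked;
                {blocked,   false} -> ok = megaco_tcp:unblock(Socket),
                                      unblocked;
                _                  -> State
            end,
        timer:sleep(100),                % poll interval (ms), example
        watchdog(Socket, NewState).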
If this is not precise enough, you could hack Megaco's
transport modules or simply plug in a brand new one.
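The contract a transport module has to fulfil is small: outgoing
messages go through the send_message/2 callback of whatever
module you name in the send handle, and incoming bytes are
handed back to the stack via megaco:receive_message/4. A bare
skeleton (the module name and the use of a raw gen_tcp socket as
the send handle are assumptions) could start as:

    %% Skeleton of a pluggable transport; only the send_message/2
    %% export is required by megaco itself. Connection setup and
    %% congestion policy are entirely up to this module.
    -module(my_megaco_transport).
    -export([send_message/2]).

    send_message(Socket, Bin) ->
        gen_tcp:send(Socket, Bin).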
A public and congestion proof megaco_sctp module would be
nice. ;-)
/Håkan