[erlang-questions] CPU load of TCP server
Kevin A. Smith
Thu Oct 15 01:32:45 CEST 2009
I wonder if using prim_inet:async_accept/2 would help reduce CPU usage.
On Oct 14, 2009, at 1:39 AM, Valentin Micic wrote:
> Hi Andrey,
> From your code snippet I suspect that you've opened your socket with
> the {active, true} flag, which may explain the excessive CPU usage: it
> is much cheaper to keep excess messages in a TCP buffer than as a bunch
> of messages waiting in a process queue (for example, messages in the
> TCP buffer have no effect on selective receive).
> Thus, changing your code to:
> loop(Socket) ->
>     receive
>         {tcp, Socket, _Packet} ->
>             %% Re-arm the socket so the next packet is delivered as a message.
>             inet:setopts(Socket, [{active, once}]),
>             loop(Socket);
>         {tcp_closed, Socket} ->
>             normal;
>         _ ->
>             loop(Socket)
>     after 500 ->
>         gen_tcp:send(Socket, [?PACKET]),
>         loop(Socket)
>     end.
> This should save some CPU cycles. Don't forget to open the socket with
> an initial {active, once}.
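
One way to get a socket into {active, once} mode before entering the loop above is sketched below. This is only an illustration, not code from the thread: the module name, start/1, handler/1 and the value of ?PACKET are all assumptions. It also varies the advice slightly: the listener is opened with {active, false}, and each handler switches its socket to {active, once} only after it has become the controlling process, so no packet can be delivered to the acceptor by mistake.

```erlang
-module(active_once_sketch).
-export([start/1]).

%% Hypothetical payload; the thread never shows what ?PACKET expands to.
-define(PACKET, <<"ping">>).

%% Listen on Port (0 lets the OS pick one). The calling process owns the
%% listening socket, so it must stay alive while the server should run.
start(Port) ->
    {ok, LSock} = gen_tcp:listen(Port, [binary, {packet, 0},
                                        {reuseaddr, true},
                                        {active, false}]),
    spawn_link(fun() -> accept_loop(LSock) end),
    {ok, LSock}.

accept_loop(LSock) ->
    case gen_tcp:accept(LSock) of
        {ok, Socket} ->
            Pid = spawn(fun() -> handler(Socket) end),
            %% The socket is still passive here, so nothing is lost
            %% while ownership moves to the handler.
            ok = gen_tcp:controlling_process(Socket, Pid),
            Pid ! go,
            accept_loop(LSock);
        {error, _Reason} ->
            ok
    end.

handler(Socket) ->
    receive go -> ok end,
    %% Initial {active, once}: at most one packet becomes a message.
    ok = inet:setopts(Socket, [{active, once}]),
    loop(Socket).

%% Same structure as the loop quoted above.
loop(Socket) ->
    receive
        {tcp, Socket, _Packet} ->
            inet:setopts(Socket, [{active, once}]),
            loop(Socket);
        {tcp_closed, Socket} ->
            normal;
        _ ->
            loop(Socket)
    after 500 ->
        gen_tcp:send(Socket, [?PACKET]),
        loop(Socket)
    end.
```

Calling active_once_sketch:start(0) and then inet:port/1 on the returned socket gives a port that a test client can connect to.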
> As for the performance difference between R12 and R13: I'm not sure,
> but I think there is more tax to be paid for pushing a bunch of messages
> around to a scheduler with the right queue.
> V/
> -----Original Message-----
> From: erlang-questions@REDACTED
> [mailto:erlang-questions@REDACTED] On Behalf Of Andrey Tsirulev
> Sent: 13 October 2009 08:34 PM
> To: Rapsey; erlang-questions@REDACTED
> Subject: Re: [erlang-questions] CPU load of TCP server
> Hi Sergej,
> Thanks for the hint. I moved the timer to the client. Now I have about
> 5.5% CPU usage per 1000 connections. I still expect it should be less.
> Andrey
> ----- Original Message -----
> From: "Rapsey" <rapsey@REDACTED>
> To: <erlang-questions@REDACTED>
> Sent: Tuesday, October 13, 2009 9:50 PM
> Subject: Re: [erlang-questions] CPU load of TCP server
>> Every time the `after` clause is executed, a timer gets created (I
>> presume). With 10k processes that probably has a noticeable CPU impact.
>> Sergej
>> 2009/10/13 Andrey Tsirulev <andrey@REDACTED>
>>> Hello all,
>>> I'm exploring the possibility of using Erlang for my TCP service
>>> application (actually a game server). I've prepared test server and
>>> client applications. The test server application accepts client
>>> connections and sends 2 small (<1 KB) packets per second to each
>>> client (and receives answers).
>>> I've run into the following problems:
>>> 1) Kernel polling doesn't give any benefit with R13B02-1.
>>> 2) CPU load is too high.
>>> All the details are below.
>>> Here's my test server's `uname -a`:
>>> Linux source 2.6.29-gentoo-r5 #1 SMP Tue Aug 18 01:15:17 MSD 2009 x86_64 AMD Sempron(tm) Dual Core Processor 2200 AuthenticAMD GNU/Linux
>>> (I've also run tests on 2 other Linux servers with different kernel
>>> versions, and the results were close.)
>>> I've made the server connection processes as simple as possible. I've
>>> tried up to 10000 concurrent connections.
>>> Test results did not show any visible difference between using
>>> multiple remote machines for client connections, one remote machine,
>>> or localhost.
>>> I tried R13B02-1 and R12B-5 OTP versions.
>>> I found that memory usage grows linearly, as expected. But I ran into
>>> a problem with CPU load.
>>> First of all, kernel polling didn't give any benefit for R13B02-1
>>> (even though erlang:system_info(kernel_poll) returned true and erl
>>> started with the message [kernel-poll:true]). I got about 55% CPU
>>> usage with 4000 connections both with and without kernel polling
>>> enabled, while with R12B-5 I get about 26% CPU usage with +K true. I
>>> suspect a bug either in OTP or in the Gentoo ebuild (of course it's
>>> also quite possible that I'm doing something wrong or missed
>>> something in the docs).
>>> The following is about R12B-5. I get about 6-7% CPU load per 1000
>>> connections (about 60% CPU load for 10000 connections). I'm not sure
>>> whether I should consider this a good result or a bad one. Most of
>>> the articles on the same subject say that CPU load is negligible in
>>> their tests and that they are fighting for memory only, so I expected
>>> not to be CPU-limited either, but evidently I am.
>>> `top` says that about 50% of the CPU load is userspace, 25% software
>>> interrupts, 20% system and 5% hardware interrupts (that's by eye, not
>>> very strict).
>>> I found that CPU load depends not so much on the connection count as
>>> on the number of transmitted packets (ok, that's obviously the number
>>> of system calls). Thus if I send 4 packets per second instead of 2, I
>>> have to halve the number of connections to keep the same CPU load.
>>> CPU load does not depend on packet size. 1 byte or 1 KB: no visible
>>> difference.
>>> CPU usage is slightly lower with the active socket option enabled than
>>> with blocking recvs.
>>> CPU usage on the single Windows client machine with 4000 connections
>>> spawned is at the same level as on the Linux server handling these
>>> 4000 connections (while I expected Linux to perform better).
>>> Switching Nagle on and off had no effect. I also tried to tune the TCP
>>> stack with sysctl using advice found here and there, but with almost
>>> no effect either.
>>> I've tried to trace with fprof and found that the bottlenecks are the
>>> 'send' operations (but I'm a relative novice to Erlang, so I'm not
>>> sure my usage of fprof was correct). Ok, that was expected too. I've
>>> read the 'why is gen_tcp:send slow?' thread, but none of the advice
>>> given there helped me.
>>> So the main question is: is a CPU usage of 7% per 1000 connections
>>> (or maybe better to say 2000 packets per second) a good result? If
>>> not, what is the expected result? How can I improve my test
>>> application? Or maybe something in my story looks strange?
>>> I know that one possible optimization is to decrease the number of
>>> packets, and I'm keeping it in mind.
>>> Here's the server connection process loop:
>>> loop(Socket) ->
>>>     receive
>>>         {tcp, Socket, _Packet} ->
>>>             loop(Socket);
>>>         {tcp_closed, Socket} ->
>>>             normal;
>>>         _ ->
>>>             loop(Socket)
>>>     after 500 ->
>>>         %% No message for 500 ms: send the next packet.
>>>         gen_tcp:send(Socket, [?PACKET]),
>>>         loop(Socket)
>>>     end.
>>> The client loop has a blocking recv and answers with a send immediately.
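
The client code itself was not posted. A minimal sketch of a client matching that description, assuming it is also written in Erlang, might look like this. The module name, start/2 and the ?REPLY payload are hypothetical; only the blocking recv followed by an immediate send comes from the message.

```erlang
-module(echo_client_sketch).
-export([start/2]).

%% Hypothetical reply payload; the original answer packet was not shown.
-define(REPLY, <<"pong">>).

%% Connect in passive mode so that gen_tcp:recv/2 can block for data.
start(Host, Port) ->
    {ok, Socket} = gen_tcp:connect(Host, Port,
                                   [binary, {packet, 0}, {active, false}]),
    loop(Socket).

loop(Socket) ->
    %% Length 0 means "return whatever bytes are available".
    case gen_tcp:recv(Socket, 0) of
        {ok, _Packet} ->
            %% Answer immediately, then block on the next recv.
            case gen_tcp:send(Socket, ?REPLY) of
                ok -> loop(Socket);
                {error, _} -> gen_tcp:close(Socket)
            end;
        {error, _Reason} ->
            gen_tcp:close(Socket)
    end.
```

Each simulated connection would run start/2 in its own process, for example spawn(echo_client_sketch, start, [Host, Port]).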
>>> Thank you very much for your time. Sorry for so many words; I tried to
>>> provide all possible information. I will answer any question and
>>> appreciate any hint.
>>> Best regards,
>>> Andrey
> ________________________________________________________________
> erlang-questions mailing list. See http://www.erlang.org/faq.html
> erlang-questions (at) erlang.org