[erlang-questions] gen_tcp receive extremely high CPU

Gleb Peregud gleber.p@REDACTED
Fri Oct 12 09:55:17 CEST 2012


On Fri, Oct 12, 2012 at 9:48 AM, Anita Wong <anita.wong@REDACTED> wrote:
> Hi All,
>
> Sorry, I'm an Erlang newbie and this may be a stupid question, but please
> help me solve this issue.
>
> I have written an Erlang server to replace the one I'm running on Node.js,
> which ate all my memory, and I'm praying that Erlang could be a way out.
> The server works properly under unit tests and internal testing, but shows
> high CPU usage under a stress test.
>
> After trimming down, I found that the CPU burst was due to the TCP receive
> from clients.
>
> receiveClientPacket(Sock) ->
>   inet:setopts(Sock, [{active, once}, {buffer, ?CLIENTHEARTBEATSIZE}]),
>   receive
>     {tcp, Sock, Data} ->
>       {ok, Data};
>     {tcp_closed, Sock} ->
>       {error, closed}
>     after ?CLIENTRECCEIVETIMEOUT ->
>       {error, timeout}
>   end.
>
>
> I tried making the process sleep for 10 hours at the beginning of the
> function (to prevent it from calling receive), and the CPU didn't burst at
> all. Therefore I conclude that the CPU burst is due to the TCP receive.
> (Please correct me if I made a mistake.)
>
> Here is some information about my stress test:
>
> start the Erlang server with:
>
> erl +zdbbl 2097151 -K true +A 128 +P 5000000
>
> connect 5000 clients to the Erlang server
> each connected client sends 2 bytes of data to the server every minute
> after all the connections are established (i.e. only the 2 bytes per minute
> are being sent), the CPU bursts to ~30%sy (from "top")
>
>
> I'm using an Amazon Linux AMI (large instance, 64-bit) for the Erlang
> server. Is the burst due to Linux? I have no idea how the system would
> use up the CPU. Or is it a problem with my poor code? (I believe so...)
>
> In the real situation, our servers receive not only ping/pong heartbeats but
> also messages, which is a much heavier load... This is only the first step...
>
> Millions of thanks to anyone who can save me.
>
> Anita~*
>
> ~~~~~~~~~~~~~~~~~~~~~~~
>
> Information about large instance (for reference):
> 7.5 GB memory
> 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
> 850 GB instance storage
> 64-bit platform
> I/O Performance: High

Hello Anita

Please try rewriting the code to use {active, false} and gen_tcp:recv
instead and see if it helps. Also, high CPU usage might be due to lock
contention in the VM - try running a VM with lock counters enabled (
http://www.erlang.org/doc/apps/tools/lcnt_chapter.html ) to see if
some lock is contended.
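
For illustration, a passive-mode version of the receive function could
look roughly like the sketch below. The module name, the function name and
both macro values are assumptions made for this example (the original post
does not give the real values of its macros), and the socket is assumed to
have been accepted with {active, false} in raw packet mode, which is the
gen_tcp default and the only mode in which the Length argument of
gen_tcp:recv/3 takes effect.

```erlang
-module(recv_sketch).
-export([receive_client_packet/1]).

%% Both values are assumptions for this sketch, not taken from the post.
-define(CLIENT_HEARTBEAT_SIZE, 2).        % bytes per heartbeat (the post mentions 2 bytes)
-define(CLIENT_RECEIVE_TIMEOUT, 120000).  % milliseconds, hypothetical

%% Sock must be a passive socket, i.e. opened or accepted with
%% {active, false}. No inet:setopts/2 call is needed on each iteration.
%% Returns {ok, Data}, {error, closed}, {error, timeout} or another
%% {error, Reason}, so callers matching on the original function's
%% return values should keep working.
receive_client_packet(Sock) ->
    gen_tcp:recv(Sock, ?CLIENT_HEARTBEAT_SIZE, ?CLIENT_RECEIVE_TIMEOUT).
```

With this shape the process blocks inside gen_tcp:recv/3 until exactly
the requested number of bytes has arrived, the peer closes the
connection, or the timeout expires.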

But the most important question - is a 30% CPU burst really a problem?
30% is usually not high CPU usage, unless you need some sort of real-time
guarantees. Also, the Erlang VM uses spin locks, so it sometimes uses
more CPU than similar code in other languages, but spin locks are used
to reduce latency.

Best regards,
Gleb Peregud


