[erlang-questions] gen_tcp receive extremely high CPU

Anita Wong <>
Fri Oct 12 10:44:18 CEST 2012


Thanks for the suggestions.

30% isn't much, but when I establish 50000 connections, CPU usage reaches 100%. When I simulate some messages, it rises to 180% and everything jams.

Our production Node.js server, which serves 70000 connections with ping-pong and messages, stays below 5% CPU most of the time. That's why I think 30% is a lot (and it doesn't stop until I stop the ping-pong).

I've tried removing the "after" as Sergej suggested, but it made no difference.
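
For reference, here is a minimal sketch of how Sergej's suggestion (one constant timer via erlang:send_after instead of an "after" clause per packet) might look. The module name, the check_idle message and the IntervalMs/TimeoutMs parameters are made up for illustration, and erlang:monotonic_time/1 with the millisecond unit assumes a recent OTP release, so treat this as an outline rather than a drop-in fix:

```erlang
-module(heartbeat_sketch).
-export([client_loop/3]).

%% Sketch only: IntervalMs and TimeoutMs are hypothetical parameters, not the
%% values behind the macros in the original code.
%% Must be called by the process that owns Sock.
client_loop(Sock, IntervalMs, TimeoutMs) ->
    %% One periodic timer per connection instead of one "after" timer per packet.
    erlang:send_after(IntervalMs, self(), check_idle),
    inet:setopts(Sock, [{active, once}]),
    loop(Sock, IntervalMs, TimeoutMs, now_ms()).

loop(Sock, IntervalMs, TimeoutMs, LastSeen) ->
    receive
        {tcp, Sock, _Data} ->
            %% Handle the packet here, then ask for the next one.
            inet:setopts(Sock, [{active, once}]),
            loop(Sock, IntervalMs, TimeoutMs, now_ms());
        {tcp_closed, Sock} ->
            {error, closed};
        {tcp_error, Sock, Reason} ->
            {error, Reason};
        check_idle ->
            %% Periodic check: has the client been silent for too long?
            case now_ms() - LastSeen > TimeoutMs of
                true ->
                    gen_tcp:close(Sock),
                    {error, timeout};
                false ->
                    erlang:send_after(IntervalMs, self(), check_idle),
                    loop(Sock, IntervalMs, TimeoutMs, LastSeen)
            end
    end.

now_ms() ->
    erlang:monotonic_time(millisecond).
```

With this shape the timeout is only detected at the next check, so it can fire up to IntervalMs late. When the loop returns because the socket closed, a check_idle message may still arrive afterwards; that is harmless if the process exits, but would need handling if it keeps running.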

I'm reading the doc about the "lock" thing, praying that it can help...
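
As far as I understand the lcnt chapter Gleb linked, a lock-counting session would look roughly like the sketch below. It assumes the emulator was built with lock counting enabled (a stock VM will not collect these statistics), and the point at which the stress test runs is only a placeholder:

```erlang
%% Typed into the Erlang shell of a lock-counting emulator.
lcnt:start().      % start the lcnt server
lcnt:clear().      % reset the lock counters in the runtime system
%% ... let the stress test run for a while ...
lcnt:collect().    % fetch the lock statistics from the runtime system
lcnt:conflicts().  % print the locks that saw contention
```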

I've even tried running the example from http://www.trapexit.org/Building_a_Non-blocking_TCP_server_using_OTP_principles, but it gives the same result... I think it may be a configuration or compilation problem...

Anita WONG
Senior System Engineer


On 12 Oct, 2012, at 4:16 PM, Rapsey <> wrote:

> Yeah, 30% is really not that much. There is one optimization you can do, however. If you have an after block, Erlang will be setting up a new timer for every packet. I would use erlang:send_after and have a constant timer that only fires every X seconds.
> 
> When it comes to serving network traffic, Erlang will surely work very well. I would recommend you take a look at the cowboy server and build on that (it can be used as an HTTP server or for plain TCP).
> 
> 
> Sergej
> 
> On Fri, Oct 12, 2012 at 9:55 AM, Gleb Peregud <> wrote:
> On Fri, Oct 12, 2012 at 9:48 AM, Anita Wong <> wrote:
> > Hi All,
> >
> > Sorry, I'm an Erlang newbie and may be asking a stupid question, but please
> > help me solve this issue.
> >
> > I have written an Erlang server to replace the Node.js one I'm using,
> > which ate all my memory, and I'm praying that Erlang could be a way out.
> > The server works properly under unit tests and internal testing, but shows
> > high CPU usage under stress testing.
> >
> > After trimming the code down, I found that the CPU burst was due to the TCP
> > receive from clients.
> >
> > receiveClientPacket(Sock) ->
> >   inet:setopts(Sock, [{active, once}, {buffer, ?CLIENTHEARTBEATSIZE}]),
> >   receive
> >     {tcp, Sock, Data} ->
> >       {ok, Data};
> >     {tcp_closed, Sock} ->
> >       {error, closed}
> >     after ?CLIENTRECCEIVETIMEOUT ->
> >       {error, timeout}
> >   end.
> >
> >
> > I tried making the process sleep for 10 hours at the beginning of the
> > function (to prevent it from calling receive), and the CPU didn't burst at
> > all. Therefore I conclude that the CPU burst is due to the TCP receive.
> > (Please correct me if I made a mistake.)
> >
> > Here is the information about my stress test:
> >
> > - start the Erlang server with:
> >
> >   erl +zdbbl 2097151 -K true +A 128 +P 5000000
> >
> > - connect 5000 clients to the Erlang server
> > - each connected client sends 2 bytes of data to the server every minute
> > - after all the connections are established (i.e. only the 2 bytes per
> >   minute per client are being sent), the CPU bursts to ~30%sy (from "top")
> >
> >
> > I'm using an Amazon Linux AMI (large instance, 64-bit) for the Erlang
> > server. Is the burst due to Linux? I have no idea how the system would
> > use up the CPU. Or is it a problem with my poor code? (I believe so...)
> >
> > In the real situation, our servers don't only receive ping-pong, but also
> > messages, which is a lot more load... This is only the first step...
> >
> > Millions of thanks to anyone who can save me.
> >
> > Anita~*
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Information about large instance (for reference):
> > 7.5 GB memory
> > 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
> > 850 GB instance storage
> > 64-bit platform
> > I/O Performance: High
> 
> Hello Anita
> 
> Please try rewriting the code to use {active, false} and gen_tcp:recv
> instead and see if it helps. Also high CPU usage might be due to lock
> contention of the VM - try running a VM with lock-counters enabled (
> http://www.erlang.org/doc/apps/tools/lcnt_chapter.html ) to see if
> there's some lock which is contended.
> 
> But the most important question - is 30% CPU burst really a problem?
> 30% is usually not a big CPU usage, unless you need some sort of real
> time guarantees. Also, the Erlang VM uses spin locks and sometimes
> uses more CPU than similar code in other languages, but spin locks are
> used to reduce latency.
> 
> Best regards,
> Gleb Peregud
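
Gleb's suggestion to use {active, false} with gen_tcp:recv could be sketched roughly as follows. The module name and the Size/TimeoutMs parameters are illustrative stand-ins for the macros in the original code, and the sketch assumes the socket is in raw packet mode so that the length argument is honoured:

```erlang
-module(passive_recv_sketch).
-export([receive_client_packet/3]).

%% Sketch only: Size and TimeoutMs are hypothetical stand-ins for
%% ?CLIENTHEARTBEATSIZE and ?CLIENTRECCEIVETIMEOUT in the original code.
receive_client_packet(Sock, Size, TimeoutMs) ->
    %% Passive mode: no {tcp, ...} messages are delivered to the mailbox;
    %% data is read explicitly with gen_tcp:recv/3.
    ok = inet:setopts(Sock, [{active, false}]),
    case gen_tcp:recv(Sock, Size, TimeoutMs) of
        {ok, Data} ->
            {ok, Data};
        {error, Reason} ->
            %% Reason is closed, timeout, or a POSIX error atom.
            {error, Reason}
    end.
```

In a real server the {active, false} option would more likely be set once, when listening or right after accepting, rather than on every call; it is repeated here only to keep the sketch close to the original function.
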
