[erlang-questions] R13B02 on 8/16 core box: all TCP communication hangs/frozen

Luke Gorrie luke@REDACTED
Thu Nov 19 15:35:00 CET 2009

2009/11/19 Scott Lystig Fritchie <fritchie@REDACTED>:
> Yes, I agree with your blame pointed at the kernel, but ... those
> sockets have a listen backlog of 4096.  Strace shows no VM activity in
> any thread when a new TCP connection is actually established.  My guess
> is that when the app does not bother calling accept(2), and then the
> listen backlog queue is full, the kernel will wait until one of the
> existing 4096 TCP connections leaves CLOSE_WAIT state, which then leaves
> an open slot in the listen queue?

Oh yes! The slow connection setup can be perfectly explained by the
application not calling accept() often enough and the listen backlog
becoming full. I just didn't understand the normal kernel behaviour
with a full backlog. Hope I didn't distract you too much :-)

FWIW I did a small experiment to better understand how Linux handles
backlogged listen sockets. I used a Debian 2.6.26-2-amd64 kernel with
default settings on the server side (no special features like SYN
cookies enabled).

I started a server with a 30-connection backlog that never calls accept():

  server> gen_tcp:listen(9999, [{backlog,30}]).

Then on the client I opened 100 connections with netcat:

  client$ for i in $(seq 1 100); do (nc -w 180 server 9999 &); done

and waited a few seconds for everything to settle and then ran netstat
to see what sockets were open on each host:

server$ netstat -an | grep 9999 | awk '{print $NF}' | sort | uniq -c | sort -n
      1 LISTEN
     29 SYN_RECV
     31 ESTABLISHED

client$ netstat -an | grep 9999 | awk '{print $NF}' | sort | uniq -c | sort -n
     40 SYN_SENT
     60 ESTABLISHED

So the kernel has allowed 31 (BACKLOG+1?) connections to become
ESTABLISHED and a further 29 (BACKLOG-1?) to enter SYN_RECV and seems
to have ignored the other 40 connection requests. The client sees a
full 60 connections in ESTABLISHED state (the server's ESTABLISHED +
SYN_RECV) and the rest in SYN_SENT still resending their SYN packets.
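
To repeat this tally on saved netstat output, or with one short command
on both hosts, the pipeline used above can be wrapped in a small shell
function. This is only a sketch: the name count_states is made up here,
and it assumes the TCP state is the last whitespace-separated field on
each line, as in the `netstat -an` output for TCP sockets. The output
format is "<count> <state>" without the padding that `uniq -c` adds.

```shell
# count_states: hypothetical helper, not part of the original experiment.
# Reads netstat-style lines on stdin and prints "<count> <state>" for each
# distinct state, sorted by ascending count.
# Assumption: the TCP state is the last field on every input line.
count_states() {
    awk '{ n[$NF]++ } END { for (s in n) print n[s], s }' | sort -n
}
```

Possible usage on either host: `netstat -an | grep 9999 | count_states`.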

If I wait for an application timeout in netcat to close the sockets on
the client side then the server looks like this:

server$ netstat -an | grep 9999 | awk '{print $NF}' | sort | uniq -c | sort -n
      1 LISTEN
     31 CLOSE_WAIT

Looks like the SYN_RECV sockets were discarded and the ESTABLISHED
transitioned into the half-closed CLOSE_WAIT state. So if the server
does start calling accept() it'll process 31 sockets that are already
closed by the client. (Fair enough.)

The SYN_RECV sockets make me curious. Linux seems to have a second
backlog queue for when the normal one fills up. Connections in this
queue seem to send a SYN+ACK (because the client reaches ESTABLISHED)
but wait in SYN_RECV. The code in net/ipv4/tcp_ipv4.c's
tcp_v4_conn_request refers to this as a "syn queue" containing "warm
entries", but I haven't dug deep enough to see how it really works.
Does anyone know? (Per Hedeland? :-))
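
For anyone digging into this, two sysctls are worth checking first. Per
the listen(2) man page, Linux silently truncates a backlog larger than
net.core.somaxconn to that value, and net.ipv4.tcp_max_syn_backlog is
documented as the limit on remembered connection requests that have not
yet completed the handshake. The sketch below only reads those values
and applies the truncation rule; the function names and the optional
directory argument are invented for illustration, and it does not
attempt to model how the kernel actually sizes the SYN queue.

```shell
# effective_backlog REQUESTED SOMAXCONN
# Prints the accept-queue limit listen(2) would use, assuming the documented
# Linux rule that a backlog above somaxconn is silently truncated to it.
effective_backlog() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# show_queue_limits [SYSCTL_DIR]
# Prints the two sysctls mentioned above. SYSCTL_DIR is a hypothetical
# parameter (default /proc/sys) so the function can be pointed elsewhere.
show_queue_limits() {
    root=${1:-/proc/sys}
    for f in net/core/somaxconn net/ipv4/tcp_max_syn_backlog; do
        if [ -r "$root/$f" ]; then
            echo "$f = $(cat "$root/$f")"
        else
            echo "$f = (unreadable)"
        fi
    done
}
```

Possible usage on the server:
`effective_backlog 4096 "$(cat /proc/sys/net/core/somaxconn)"`. If
somaxconn has not been raised, a requested backlog of 4096 would be cut
down to that smaller value.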

I posted a tcpdump at http://fresh.homeunix.net/~luke/misc/9999.pcap
in case anyone else is having a "geek out on Linux networking" week.

