strange packet loss

Joel Reymont joelr1@REDACTED
Tue Aug 11 18:17:14 CEST 2009


I'm facing a strange issue where packets are sent over the loopback  
interface from server to client but the client does not receive them.  
I've been scratching my head over this for a while now and I just  
can't see what's wrong. I'm running 10k clients and this only happens  
to 100-500 of them.

I'm sending a subscribe packet to the server and the server replies  
with an ACK. If the server does not reply, the client will resend the  
subscription request.
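
As a rough sketch of the retry behaviour described above (not the actual client code; the module name, function name, and the Timeout and Attempts parameters are all hypothetical), assuming the socket is in {active, true} mode so that replies arrive as messages:

```erlang
-module(subscribe_retry).
-export([subscribe/4]).

%% Send Request on Sock and wait up to Timeout milliseconds for a reply.
%% If nothing arrives in time, resend, for at most Attempts sends in total.
%% Assumes Sock is in {active, true} mode and is owned by the calling process.
subscribe(_Sock, _Request, _Timeout, 0) ->
    %% Out of attempts: no reply was ever delivered to this process.
    {error, no_ack};
subscribe(Sock, Request, Timeout, Attempts) when Attempts > 0 ->
    case gen_tcp:send(Sock, Request) of
        ok ->
            receive
                {tcp, Sock, Reply}        -> {ok, Reply};
                {tcp_closed, Sock}        -> {error, closed};
                {tcp_error, Sock, Reason} -> {error, Reason}
            after Timeout ->
                %% No reply within Timeout: resend the subscription request.
                subscribe(Sock, Request, Timeout, Attempts - 1)
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

A call such as subscribe_retry:subscribe(Sock, Request, 3000, 5) would send up to five times, three seconds apart, before giving up.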

According to tcpdump:

---
biggie:~ joelr$ sudo tcpdump -i lo0 -vv -X tcp port 47835

tcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 65535 bytes
17:04:19.009254 IP (tos 0x0, ttl 64, id 49634, offset 0, flags [DF], proto TCP (6), length 92, bad cksum 0 (->7ab7)!)
     localhost.47835 > localhost.us-cli: Flags [P.], cksum 0xfe50 (incorrect -> 0x050a), seq 4234590488:4234590528, ack 2064297246, win 40830, options [nop,nop,TS val 960438746 ecr 960438703], length 40
	0x0000:  4500 005c c1e2 4000 4006 0000 7f00 0001  E..\..@.@.......
	0x0010:  7f00 0001 badb 1f92 fc66 b918 7b0a ad1e  .........f..{...
	0x0020:  8018 9f7e fe50 0000 0101 080a 393f 21da  ...~.P......9?!.
	0x0030:  393f 21af 0026 7b22 6163 7469 6f6e 223a  9?!..&{"action":
	0x0040:  2273 7562 7363 7269 6265 222c 2264 6174  "subscribe","dat
	0x0050:  6122 3a22 6576 656e 7473 227d            a":"events"}
17:04:19.009265 IP (tos 0x0, ttl 64, id 28825, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->cc28)!)
     localhost.us-cli > localhost.47835: Flags [.], cksum 0xfe28 (incorrect -> 0x0a4a), seq 1, ack 40, win 65535, options [nop,nop,TS val 960438746 ecr 960438746], length 0
	0x0000:  4500 0034 7099 4000 4006 0000 7f00 0001  E..4p.@.@.......
	0x0010:  7f00 0001 1f92 badb 7b0a ad1e fc66 b940  ........{....f.@
	0x0020:  8010 ffff fe28 0000 0101 080a 393f 21da  .....(......9?!.
	0x0030:  393f 21da                                9?!.
17:04:19.009795 IP (tos 0x0, ttl 64, id 40175, offset 0, flags [DF], proto TCP (6), length 57, bad cksum 0 (->9fcd)!)
     localhost.us-cli > localhost.47835: Flags [P.], cksum 0xfe2d (incorrect -> 0x7df6), seq 1:6, ack 40, win 65535, options [nop,nop,TS val 960438746 ecr 960438746], length 5
	0x0000:  4500 0039 9cef 4000 4006 0000 7f00 0001  E..9..@.@.......
	0x0010:  7f00 0001 1f92 badb 7b0a ad1e fc66 b940  ........{....f.@
	0x0020:  8018 ffff fe2d 0000 0101 080a 393f 21da  .....-......9?!.
	0x0030:  393f 21da 0003 4143 4b                   9?!...ACK
---

You can see that the client sends a subscription request and that the  
server replies with an ACK.
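
Both payloads in the capture begin with two bytes that look like a big-endian length: 0x0026 (38) ahead of the 38-byte JSON request and 0x0003 ahead of "ACK". A minimal sketch of splitting such a frame by hand, assuming that reading of the dump is right (the module and function names are hypothetical):

```erlang
-module(frame_demo).
-export([decode/1]).

%% Split one frame with a 2-byte big-endian length prefix off the front
%% of a binary. Returns the frame body and whatever bytes follow it.
decode(<<Len:16/big, Body:Len/binary, Rest/binary>>) ->
    {ok, Body, Rest};
%% Fewer bytes than the prefix announces (or no complete prefix yet).
decode(Bin) when is_binary(Bin) ->
    {more, Bin}.
```

For example, frame_demo:decode(<<16#00,16#03,16#41,16#43,16#4b>>) takes the five payload bytes of the server's reply above as input.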

According to inet:getstat/1 on the client socket, the send counters
keep growing but the receive counters stay at zero.

---
(debug@REDACTED)11> inet:getstat(Sock).
{ok,[{recv_oct,0},
      {recv_cnt,0},
      {recv_max,0},
      {recv_avg,0},
      {recv_dvi,0},
      {send_oct,1044},
      {send_cnt,27},
      {send_max,40},
      {send_avg,38},
      {send_pend,0}]}
(debug@REDACTED)12> inet:getstat(Sock).
{ok,[{recv_oct,0},
      {recv_cnt,0},
      {recv_max,0},
      {recv_avg,0},
      {recv_dvi,0},
      {send_oct,1124},
      {send_cnt,29},
      {send_max,40},
      {send_avg,38},
      {send_pend,0}]}
---
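
Between the two samples above, send_cnt grows by 2 and send_oct by 80 while every recv counter stays at 0. A small sketch for computing that difference, assuming it is run against the same client socket (the module and function names are hypothetical; inet:getstat/2 takes the list of counters to fetch):

```erlang
-module(stat_delta).
-export([delta/2, sample/2]).

%% Per-counter difference between two inet:getstat results.
%% Counters missing from Before are treated as 0.
delta(Before, After) ->
    [{Key, New - proplists:get_value(Key, Before, 0)} || {Key, New} <- After].

%% Read the counters twice, IntervalMs apart, and return what changed.
sample(Sock, IntervalMs) ->
    Keys = [recv_cnt, recv_oct, send_cnt, send_oct],
    {ok, Before} = inet:getstat(Sock, Keys),
    timer:sleep(IntervalMs),
    {ok, After} = inet:getstat(Sock, Keys),
    delta(Before, After).
```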

I don't think it's the socket options:

---
(debug@REDACTED)15> inet:getopts(Sock, [active]).
{ok,[{active,true}]}

(debug@REDACTED)18> inet:getopts(Sock, [reuseaddr]).
{ok,[{reuseaddr,true}]}

(debug@REDACTED)19> inet:getopts(Sock, [packet]).
{ok,[{packet,2}]}
---

The socket options are initially set up like this:

---
     case gen_tcp:connect(State#state.host,
                          State#state.port,
                          [binary,
                           {packet, 0},
                           {active, true},
                           {reuseaddr, true}
                          ], 3000) of
---

Any suggestions on how to solve this issue?

	Thanks, Joel

P.S. Snow Leopard, 64-bit mode, OTP R13B01

Darwin biggie.local 10.0.0 Darwin Kernel Version 10.0.0: Sat Jul 18  
23:34:57 PDT 2009; root:xnu-1456.1.22~1/RELEASE_X86_64 x86_64

Erlang R13B01 (erts-5.7.2) [source] [64-bit] [rq:1] [async-threads:0]  
[kernel-poll:true]

---
Mac hacker with a performance bent
http://www.linkedin.com/in/joelreymont


