<div dir="ltr"><b>Problem Statement</b><br><br>When gen_tcp:connect/3 or gen_tcp:connect/4 is called on a host/port that has no program listening on it, it returns {ok, Sock} at random intervals instead of the expected {error, econnrefused}. If gen_tcp:recv(Sock, 0) is then called immediately on the socket just returned, it returns {error, econnrefused}. The connection options used were <span style="font-family: courier new,monospace;">[binary, {packet, raw}, {active, false}]</span>. Note that gen_tcp:connect succeeds when a program is listening on that same host/port, so this is unlikely to be a firewall issue.<br>
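<br>A minimal sketch of the check described above, assuming a hypothetical module name <span style="font-family: courier new,monospace;">connect_probe</span> and an arbitrary 1000 ms receive timeout (the report itself calls gen_tcp:recv(Sock, 0) without a timeout). It is not the attached test program, only an illustration of how a single attempt might be classified:<br>

```erlang
-module(connect_probe).
-export([probe/2]).

%% Connection options taken from the report.
-define(OPTS, [binary, {packet, raw}, {active, false}]).

%% Classify one connection attempt to Host:Port.
%% The atoms returned here are hypothetical names chosen for this sketch.
probe(Host, Port) ->
    case gen_tcp:connect(Host, Port, ?OPTS) of
        {error, econnrefused} ->
            refused;                       % expected when nothing is listening
        {error, Reason} ->
            {connect_error, Reason};
        {ok, Sock} ->
            %% Assumption: a 1000 ms timeout, so that a genuine but silent
            %% peer does not block the caller forever.
            Result = gen_tcp:recv(Sock, 0, 1000),
            ok = gen_tcp:close(Sock),
            case Result of
                {error, econnrefused} -> false_positive;  % the reported symptom
                _Other                -> connected
            end
    end.
```

With nothing listening on the target port, <span style="font-family: courier new,monospace;">connect_probe:probe(Host, Port)</span> would normally be expected to return <span style="font-family: courier new,monospace;">refused</span>; a result of <span style="font-family: courier new,monospace;">false_positive</span> corresponds to the behaviour described in this report.<br>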
<br><b>Reproducing the error</b><br><br>The attached test program demonstrates the bug most quickly if you run with more than one scheduler and with kernel poll enabled. <br><br>It was run in the shell as<br><br><span style="font-family: courier new,monospace;">gen_tcp_connect_bug:go(Host, Port).</span><br>
<br>It seemed to make no difference whether gen_tcp:connect/3 or gen_tcp:connect/4 was called.<br>The destination system was another computer on the same local subnet, which was running Windows XP.<br>Running the test with only one scheduler (+S 1) didn't return a false positive in over 4 million connect() attempts (after which I terminated the program). It may be reasonable to assume that the issue won't appear with one scheduler.<br>
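<br>The attached program is not reproduced in this message. As a rough, hypothetical stand-in, a loop like the following counts how many connect attempts pass before one returns {ok, Sock}. The module name <span style="font-family: courier new,monospace;">connect_loop</span>, the function name, the <span style="font-family: courier new,monospace;">Max</span> limit, and the result tuples are all assumptions made for this sketch:<br>

```erlang
-module(connect_loop).
-export([attempts_until_ok/3]).

%% Try up to Max connects against a port that is expected to refuse them.
%% Returns {ok_returned, N} if attempt N unexpectedly returned {ok, Sock},
%% {none, Max} if every attempt was refused, or {error, Reason, N} if some
%% other error stopped the run at attempt N.
attempts_until_ok(Host, Port, Max) ->
    attempts_until_ok(Host, Port, Max, 1).

attempts_until_ok(_Host, _Port, Max, N) when N > Max ->
    {none, Max};
attempts_until_ok(Host, Port, Max, N) ->
    %% Same options as in the report.
    case gen_tcp:connect(Host, Port, [binary, {packet, raw}, {active, false}]) of
        {ok, Sock} ->
            ok = gen_tcp:close(Sock),
            {ok_returned, N};
        {error, econnrefused} ->
            attempts_until_ok(Host, Port, Max, N + 1);
        {error, Reason} ->
            {error, Reason, N}
    end.
```

Starting the emulator with <span style="font-family: courier new,monospace;">erl +S 4 +K true</span> and calling <span style="font-family: courier new,monospace;">connect_loop:attempts_until_ok(Host, Port, 100000)</span> would be one way to compare attempt counts across scheduler and kernel-poll settings.<br>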
<ul><li>erl +S 4 +K true: false positive returned after anywhere between 1 and a few thousand attempts.</li><li>erl +S 4 +K false: false positive returned after roughly 70,000 or more attempts.</li><li>erl +S 1 with +K true or +K false: no false positive returned in &gt; 4.2 million attempts.</li>
</ul>The attached Wireshark text file trace shows that on every occasion, gen_tcp sent
a SYN and received an RST, even for the request that returned an open
socket, so it doesn't seem to be a TCP/IP problem. The binary version of the trace is available if required.<br>
<br>
This suggests that the bug is related to SMP mode. It may also have something to do with 64-bit mode, but I don't have a 32-bit system to try it on. The false positive probably appears much sooner with kernel poll enabled because of timing.<br>
<br><b>Test Environment</b><br><ul><li>Intel Q6600, Intel XBX2 MB, 8GB RAM, On-board Broadcom Gigabit Ethernet<br></li><li>Ubuntu Linux x86_64 Gutsy (Linux 2.6.24-16-generic #1 SMP Thu Apr 10 12:47:45 UTC 2008 x86_64 GNU/Linux)</li>
<li>Erlang R12B-3</li></ul>Regards,<br>Edwin Fine<br>-- <br>For every expert there is an equal and opposite expert - Arthur C. Clarke<br>
</div>