[erlang-questions] HTTP requests and the meaning of eaddrinuse
Sat Jan 31 00:01:05 CET 2009
Sverker Eriksson <sverker@REDACTED> wrote:
>Johnny Billquist wrote:
>> Sverker Eriksson wrote:
>>> Johnny Billquist wrote:
>>>> [...] you will most probably hit the limit of the number of open
>>>> file descriptors long before you exhaust all the local port numbers.
>>>> By default on my Mac, the max file descriptors is 256 (per
>>>> process). There is also a limit on the total number of file
>>>> descriptors in the OS. Nowhere near the theoretical limit of 65536
>>>> ports in tcp. So that should give you enfile or emfile.
>>> The internal TIME_WAIT state of the TCP protocol may cause exhaustion
>>> of port numbers even though the file descriptor limit is much lower
>>> than 65536. Use the netstat command tool to view lingering
>>> connections in TIME_WAIT state.
>> True, if the connections aren't closed properly.
>Actually, kind of the opposite. The peer that actively closes the
>connection by calling close() will put its "socket" into TIME_WAIT.
Indeed, it is required by the spec.
>It's like a quarantine to avoid late arriving packets of the old
>connection from being confused with a new connection using the same
>port. A major flaw of TCP if you ask me.
Do you have a better solution to the problem? It is a pretty important
thing to solve, since such packets could potentially cause a reset of
the new session or even corrupt the data stream without any error
indication. It could possibly be argued that the default time of
(typically) 2 minutes is pretty huge.
>> But yes, that could be it. I wonder if eaddrinuse really is returned
>> in that case. [...]
>I experienced this some time ago writing a benchmark on Linux. I'm quite
>sure it was eaddrinuse that I got when the port numbers were exhausted.
Yes, that's what you get (on connect()) on all Unices where I have seen
it, and it does make some amount of sense I think - if there are no
local ports left, any attempt to form a connection would have to use a
port that is already "in use".
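
As a rough illustration of telling these failures apart in client code, here is a minimal Python sketch. The helper names explain_connect_error and try_connect are made up for this example. It also checks EADDRNOTAVAIL, since the Linux connect(2) man page documents that code for an exhausted ephemeral port range, so which of the two you see may depend on the platform and on whether the socket was bound before connecting.

```python
import errno
import socket


def explain_connect_error(exc):
    """Map an OSError from connect() to a rough diagnosis (illustrative only)."""
    if exc.errno in (errno.EADDRINUSE, errno.EADDRNOTAVAIL):
        # No usable local port: the range is exhausted or stuck in TIME_WAIT.
        return "local ports exhausted"
    if exc.errno in (errno.EMFILE, errno.ENFILE):
        # Per-process (EMFILE) or system-wide (ENFILE) descriptor limit.
        return "file descriptor limit reached"
    return "other error"


def try_connect(host, port, timeout=5.0):
    """Hypothetical wrapper: return (socket, None) or (None, diagnosis)."""
    try:
        return socket.create_connection((host, port), timeout=timeout), None
    except OSError as exc:
        return None, explain_connect_error(exc)
```

A benchmark loop could call try_connect() and back off, rather than abort, when the diagnosis is "local ports exhausted".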
The issue is actually a bit more complex than "running out of local
ports" though - it is only the complete 4-tuple that has to be unique,
i.e. multiple connections from the same local address and port are OK if
either the remote address or the remote port differs.
Checking for this uniqueness may be expensive, which could explain why
you get this problem "unnecessarily" - the stack may optimistically use
a local port that isn't actually free, on the assumption that you don't
normally make all your connections to the same remote address/port, so
it "should" work out. Except it doesn't when you are doing benchmarks
and other things that do tons of connections to the same address/port.
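
To make the 4-tuple rule and the "optimistic" guess concrete, here is a toy Python model. It is only a sketch of the speculation above, not how any real TCP stack is implemented, and the function names, addresses and ports are all invented for the example.

```python
def pick_strict(conns, local_addr, remote, ports):
    """Search every candidate port for one whose full 4-tuple is unused."""
    for port in ports:
        if (local_addr, port, *remote) not in conns:
            return port
    return None


def pick_optimistic(conns, local_addr, remote, candidate):
    """Take one candidate port without searching; fail if the 4-tuple collides."""
    if (local_addr, candidate, *remote) in conns:
        return None  # this is where the caller would see eaddrinuse
    return candidate


LOCAL = "10.0.0.1"
PORTS = [40000, 40001]
# One existing connection: (local addr, local port, remote addr, remote port).
conns = {(LOCAL, 40000, "10.0.0.2", 80)}

# Same local port, different remote port: the 4-tuple is unique, so it is fine.
reuse = pick_optimistic(conns, LOCAL, ("10.0.0.2", 8080), 40000)

# Same remote address/port as the existing connection: the optimistic guess
# collides, although a full search would have found port 40001.
guess = pick_optimistic(conns, LOCAL, ("10.0.0.2", 80), 40000)
searched = pick_strict(conns, LOCAL, ("10.0.0.2", 80), PORTS)
```

Here reuse is 40000, guess is None and searched is 40001, which is the "unnecessary" failure: the guess only goes wrong when everything connects to the same remote address/port.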
There are ways to tweak things in those cases though. The port range
has already been mentioned and is normally configurable
(/proc/sys/net/ipv4/ip_local_port_range on Linux), so you can actually
use the ~ 64k theoretical max. On some OSes you may be able to tinker
with the TIME_WAIT timeout (you know that packets won't hang around for
2 minutes on your LAN where you are doing the benchmark). And you may be
able to use multiple local addresses in a round-robin fashion (the same
port on different local addresses will never cause a conflict).
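
For a back-of-the-envelope feel for what these knobs buy you, here is a small Python sketch. It assumes the worst case of all connections going to one remote address/port, with every closed connection holding its local port for the full TIME_WAIT period. The function names and the sample range string are illustrative, not taken from any particular system.

```python
def parse_port_range(text):
    """Parse the "low high" pair found in ip_local_port_range on Linux."""
    low, high = (int(field) for field in text.split())
    return low, high


def max_sustained_rate(low, high, time_wait_seconds, local_addresses=1):
    """Upper bound on new connections per second to a single remote
    address/port, if each closed connection pins its local port for
    time_wait_seconds."""
    ports = high - low + 1
    return ports * local_addresses / time_wait_seconds


# Example values only; read the real range from the /proc file on your box.
low, high = parse_port_range("32768\t61000")
narrow = max_sustained_rate(low, high, 120)        # about 235 per second
wide = max_sustained_rate(1024, 65535, 120)        # about 538 per second
two_addrs = max_sustained_rate(1024, 65535, 120, local_addresses=2)
```

Widening the range or adding local addresses scales the bound linearly, and so does shortening the TIME_WAIT timeout, which is why a benchmark can hit the limit so quickly with the defaults.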
How any of this works on Windows I have no idea though.