Socket connects to itself

Rob Charlton rob@REDACTED
Mon Oct 12 20:34:02 CEST 2009


Hi,

I encountered a bug recently, which I have now fixed, but I wanted to
share it in case anyone else encounters it. I'm also curious to see if
anyone can give a better explanation of what happens.

I have a process, call it P, which uses gen_tcp:connect(Addr,Port,
[binary,{packet,4}], 1000) to try to connect to a server on the local
machine. Actually there are several of these processes and they try to
connect to a list of different servers specified in the application's
config file. If the connection times out, or fails for some reason, then
an attempt is made to connect again in 1 second using
erlang:send_after.  Normally the system is configured in such a way that
the list of servers matches those that are actually running (or will be
running), so eventually all the P processes get connected and all is well.
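
For illustration, the retry loop described above might look roughly like
this. It is only a minimal sketch, not the actual code: the module name
connector_sketch, the function names and the try_connect message are all
made up for the example, and the connected/1 state is a placeholder.

```erlang
-module(connector_sketch).
-export([start/2, init/2]).

%% Hypothetical sketch of a process "P" that keeps trying to connect.
%% All names in this module are illustrative assumptions.

start(Addr, Port) ->
    spawn(?MODULE, init, [Addr, Port]).

init(Addr, Port) ->
    %% Trigger the first connection attempt immediately.
    self() ! try_connect,
    loop(Addr, Port).

loop(Addr, Port) ->
    receive
        try_connect ->
            case gen_tcp:connect(Addr, Port, [binary, {packet, 4}], 1000) of
                {ok, Socket} ->
                    connected(Socket);
                {error, _Reason} ->
                    %% Timed out or failed: try again in 1 second.
                    erlang:send_after(1000, self(), try_connect),
                    loop(Addr, Port)
            end
    end.

%% Placeholder for the real protocol handling once connected.
connected(Socket) ->
    receive
        {tcp, Socket, _Data} -> connected(Socket);
        {tcp_closed, Socket} -> ok
    end.
```

Note that nothing in this loop checks who is at the other end of the
socket; any {ok, Socket} result is treated as a successful connection to
the server.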

Sometimes, however, the system is incorrectly configured so that one of
the servers is listed in the config file but is not actually running. In
this case the process P will continually try to connect to it every
second. This is where it gets interesting.

I noticed in the logs one night that this process P, which had tried for
several hours to connect to a non-existent server on 127.0.0.1:2920, had
actually succeeded! Even more exciting than that, it had connected to
itself, so when it started trying to communicate with its peer it was
very surprised to receive its own communications back again. Needless to
say P crashed at this point. I was very curious to know how this could
happen and to try and fix it! You could argue this isn't a bug because
the system configuration was incorrectly specified. In this particular
case though, the misconfiguration can happen easily and I would like P
not to crash but instead to keep trying.

I tried Googling for this and did find a couple of non-Erlang sources
suggesting it had happened to them in similar circumstances. It could be
that this is a very timing-sensitive 'feature' of TCP/IP. Can anyone
enlighten me?

I did manage to fix it, by testing whether inet:peername(Socket) =:=
inet:sockname(Socket) when I get a connection. Very occasionally this
evaluates to true, so in that case I close the socket and try again.
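
A minimal sketch of that check, for anyone who wants to copy it. The
module name self_connect_check, the function names and the self_connect
error atom are made up for this example; only gen_tcp:connect/4,
gen_tcp:close/1, inet:peername/1 and inet:sockname/1 are real calls.

```erlang
-module(self_connect_check).
-export([connect_checked/2, is_self_connected/1]).

%% Hypothetical wrapper: connect, then reject a socket whose local and
%% remote endpoints are identical (i.e. it has connected to itself).
connect_checked(Addr, Port) ->
    case gen_tcp:connect(Addr, Port, [binary, {packet, 4}], 1000) of
        {ok, Socket} ->
            case is_self_connected(Socket) of
                true ->
                    %% Self-connection: close it so the caller retries.
                    gen_tcp:close(Socket),
                    {error, self_connect};
                false ->
                    {ok, Socket}
            end;
        {error, _Reason} = Error ->
            Error
    end.

%% True only when both lookups succeed and return the same
%% {Address, Port} pair.
is_self_connected(Socket) ->
    case {inet:peername(Socket), inet:sockname(Socket)} of
        {{ok, Same}, {ok, Same}} -> true;
        _ -> false
    end.
```

One small difference from the plain =:= comparison: this version only
reports a self-connection when both calls return {ok, ...}, so two
identical error results are not mistaken for one. The caller can treat
{error, self_connect} like any other failure and retry after 1 second.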

Cheers

Rob

-- 
Erlang Training and Consulting Ltd
www.erlang-consulting.com


