[erlang-questions] prim_inet:close/1 race condition

Dmitry Belyaev be.dmitry@REDACTED
Thu Oct 18 14:39:10 CEST 2012


Is this a known bug?
Can anyone propose a patch for prim_inet?

Thank you.

On 12.10.2012, at 1:06, Dmitry Belyaev wrote:

> A few days ago we found thousands of leaked sockets in our project.
> 
> These sockets were ports with state like this:
> [{name,"tcp_inet"},
>  {links,[]},
>  {connected,<0.54.0>},...]
> 
> We investigated and found the cause of the leaks.
> 
> We set the inet option {exit_on_close, false} so that we can read statistics from the socket after the peer has closed it. The process that controls the socket does not trap exits and is linked to another process.
> At the end of the connection the controller calls gen_tcp:close/1, and sometimes the linked process dies at that same moment. We found that gen_tcp:close/1 calls prim_inet:close/1, whose first action is to unlink the port from the controlling process. So when the controller has been unlinked from the port and is then killed by the exit signal, the port stays in the system because of the exit_on_close feature.
> 
> I've written a module that can sometimes reproduce the problem: https://gist.github.com/3875485
> On my system, half a dozen calls to close_bug:start(1000) do find such leaked ports.
> I haven't found the right solution for the problem yet, so I have no patch at the moment.
> 
> Thank you for your attention.
> 
> -- 
> Dmitry Belyaev
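
For anyone who wants to check a running node for ports in the state quoted above, here is a minimal sketch. The module and function names (leak_check, leaked_tcp_ports/0) are made up for illustration and are not part of the gist. It only assumes that a leaked socket shows up as a port with driver name "tcp_inet" and an empty link list, as in the port info shown in the quoted message.

```erlang
-module(leak_check).
-export([leaked_tcp_ports/0]).

%% Hypothetical helper, not part of OTP or of the gist mentioned above.
%% Returns every port whose driver name is "tcp_inet" and which has no
%% links, i.e. ports that look like the leaked sockets described above.
%% erlang:port_info/2 returns 'undefined' for a port that has already
%% been closed, so such ports simply fail both comparisons.
leaked_tcp_ports() ->
    [P || P <- erlang:ports(),
          erlang:port_info(P, name) =:= {name, "tcp_inet"},
          erlang:port_info(P, links) =:= {links, []}].
```

Calling length(leak_check:leaked_tcp_ports()) before and after a run of close_bug:start(1000) should show whether any new ports were left behind. A port with no links is only a hint, not proof of this particular race, so the result may need checking by hand.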
