{error,closed} vs. {error,econnreset}

Reviving this old thread again, because I am getting more and more convinced that we need further changes.
We are still observing connection close events in cases where an error should be reported at the gen_tcp level.
It could be a reset that is somehow still not reported as 'econnreset', but I suspect it is some other error.

I checked the code in inet_drv.c. The function
static int tcp_recv(tcp_descriptor* desc, int request_len)
seems to work properly: a reset is reported either as closed or as econnreset, depending on show_econnreset, and all other errors are reported as errors.

However, the function
static int tcp_send_or_shutdown_error(tcp_descriptor* desc, int err)
hides errors. Connection reset errors are properly handled (either reported as closed or econnreset, depending on show_econnreset), but all other errors are just reported as closed.
Active and passive modes have independent code paths, but I think both do the same: all errors are reported as normal close -- except for econnreset.
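
To make the distinction concrete from the caller's side, here is a minimal sketch of how an application might sort the results of gen_tcp:recv/2 and gen_tcp:send/2 into categories. The module name tcp_result and the function classify/1 are hypothetical, made up for illustration; only the result tuples come from gen_tcp. If my reading of the send path is right, a send that fails for any reason other than a reset would end up in the closed clause, which is exactly the problem.

```erlang
-module(tcp_result).
-export([classify/1]).

%% Hypothetical helper (not part of OTP): maps the result of
%% gen_tcp:recv/2 or gen_tcp:send/2 to a coarse category.
classify(ok)                  -> ok;
classify({ok, _Data})         -> ok;
%% A normal close by the peer, or (as suspected above) a masked error.
classify({error, closed})     -> closed;
%% Reported when {show_econnreset, true} is set on the socket.
classify({error, econnreset}) -> reset;
classify({error, timeout})    -> timeout;
%% Any other POSIX error that does reach the caller.
classify({error, Reason})     -> {other_error, Reason}.
```

With this in place, tcp_result:classify(gen_tcp:send(Socket, Data)) gives a single value to match on, but it cannot recover information the driver has already collapsed into closed.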

Apparently I need to detect all errors.
Is it possible to implement a show_errors (or show_all_errors) flag, too?

Actually, this new flag could replace the current show_econnreset flag.
Having two separate flags for econnreset and for other errors requires more complex code, whereas a single show_errors flag would simplify the current code that gives econnreset special treatment.
I am not sure if it makes much sense to expose connection reset errors but still mask all other errors as normal close events.

From a broader point of view, it seems there are network-programming tasks (there is at least one!) for which Erlang is not suitable. This is rather sad.
Luckily the fix doesn't seem difficult.

What do you think?


Hi Andras,

I'm answering this because you addressed me directly.

At the current moment I am not in a position to speak with any authority
regarding the code in inet_drv.c. It's been over a year since I last looked
at this code and it would take me a good few hours to get to a point where
I was comfortable commenting on it again.

If you sniff the wire and can confirm with absolute certainty that the
peer is sending an RST that is getting reported as {error, closed} in
Erlang, then I'll make the time to look into it. Needless to say, you
would need to provide significantly more information about the scenario
in which this occurred: OS where Erlang is running, OS on the peer, all
the socket options you set, the exact call(s) where you received
{error, closed}, etc. Ideally you'd be able to provide a tcpdump trace of
when the connection went down; it wouldn't need to be a deep trace, just
enough to see the details of the TCP traffic (flags, sequence numbers,



> Hi Rory, All,
> Let me revive this old thread for a minute. The new gen_tcp option {show_econnreset, true} works well.
> However, we still notice some cases when we observe {error,closed} on the Erlang side, but other signs suggest that the TCP connection wasn't intentionally closed by the peer, but was closed because of some error.
> We suspect some packets being dropped by the OS due to various buffer overruns.
> I am not very familiar with packet-level details of TCP. Can someone confirm if there are other erroneous terminations of a TCP connection (other than econnreset), reported simply as {error,closed} by Erlang?
> I tried checking the code in erts/emulator/drivers/common/inet_drv.c and it seems to me that there are not, but can someone who actually understands that code confirm?
> Thank you very much,
>    Andras
> Hi Rory,
> I just tested this new feature in Erlang/OTP R18 and it works fine.
> Thank you very much all for implementing it!
> Regards,
>    Andras
> Hi Andras,
> > Thank you very much for your efforts Rory.
> > 
> > The ability "to set a socket option that shows all econnreset errors" sounds like the right solution. I am wondering why hiding this detail is the default, but I believe there were good enough reasons to design it that way.
> > 
> > I accept that your solution will not notice the connection reset event in some corner cases. I think this will not apply in my case: I am sending a small amount of data (<1KB) and wait for the reply.
> > 
> > I am looking forward to seeing your patch in the next release of Erlang/OTP!
> The fix for this is in the 18.0 release. It should take care of the corner cases too.
> Use the socket option '{show_econnreset, true}' and you'll receive {error, econnreset} in passive mode or {tcp_error, Socket, econnreset} in active mode. See the docs [1] for more information.
> Regards,
> Rory
> [1] http://www.erlang.org/doc/man/inet.html#setopts-2
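
For reference, the usage described in the quoted message above could look roughly like the sketch below. The module name econnreset_demo and the functions recv_passive/2 and wait_active/1 are hypothetical names chosen for illustration; the option {show_econnreset, true} and the result shapes are the ones named in that message. The calling process is assumed to be the controlling process of the socket.

```erlang
-module(econnreset_demo).
-export([recv_passive/2, wait_active/1]).

%% Passive mode: enable the option, then read once.
%% A reset from the peer is expected as {error, econnreset}
%% instead of {error, closed}.
recv_passive(Socket, Timeout) ->
    ok = inet:setopts(Socket, [{active, false}, {show_econnreset, true}]),
    case gen_tcp:recv(Socket, 0, Timeout) of
        {ok, Data}          -> {data, Data};
        {error, econnreset} -> reset;
        {error, closed}     -> closed;
        {error, Reason}     -> {error, Reason}
    end.

%% Active mode: a reset is expected as a tcp_error message.
%% A {tcp_closed, Socket} message may still follow it in the mailbox;
%% this sketch does not drain it.
wait_active(Socket) ->
    ok = inet:setopts(Socket, [{show_econnreset, true}, {active, once}]),
    receive
        {tcp, Socket, Data}             -> {data, Data};
        {tcp_error, Socket, econnreset} -> reset;
        {tcp_error, Socket, Reason}     -> {error, Reason};
        {tcp_closed, Socket}            -> closed
    end.
```

Neither function can tell a masked non-reset error from a normal close; both still see closed in that case, which is the gap discussed at the top of this thread.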
