[erlang-questions] {error,closed} vs. {error,econnreset}

Rory Byrne rory@REDACTED
Sat May 2 12:32:09 CEST 2015


Hi Andras,

On Fri, May 01, 2015 at 11:53:13AM +0000, Bekes, Andras G wrote:
> 
> Please see the trace in the attached file. Do you have suggestions on what else shall I trace?
> 
>>> On 2015-02-10 14:39, Bekes, Andras G wrote:
>>> 
>>> Looks like I have a problem with the erroneous result of gen_tcp:recv/2, as it returns {error,closed} instead of my expectation of {error,econnreset}.
>>> 
...
>>> 
>>> Unfortunately it looks like I really need to separate these two results.
>>> 
>>> I also tried gen_tcp in active mode, but no difference, the result is {error,closed} instead of {error,econnreset}.
>>> 
>>> Can someone explain why the econnreset error is masked? Is there any way I can separate the two kinds of events?
>>> 

It seems it was a design decision to mask both ECONNRESET recv errors and
all send errors. You can see where this happens for recv errors at around
line 9988 in erts/emulator/drivers/common/inet_drv.c. Open that file and
search for ECONNRESET and you will find the following section of code.

    n = sock_recv(desc->inet.s, desc->i_ptr, nread, 0);

    if (IS_SOCKET_ERROR(n)) {
        int err = sock_errno();
        if (err == ECONNRESET) {
            DEBUGF((" => detected close (connreset)\r\n"));
            return tcp_recv_closed(desc);
        }
        if (err == ERRNO_BLOCK) {
            DEBUGF((" => would block\r\n"));
            return 0;
        }
        else {
            DEBUGF((" => error: %d\r\n", err));
            return tcp_recv_error(desc, err);
        }
    }
    else if (n == 0) {
        DEBUGF(("  => detected close\r\n"));
        return tcp_recv_closed(desc);
    }

As you can see from this, ECONNRESET is being treated as though it's an EOF 
rather than an error: both end up calling tcp_recv_closed(). As a quick fix to
your problem, change the code to read:

        if (err == ECONNRESET) {
            DEBUGF((" => detected close (connreset)\r\n"));
            return tcp_recv_error(desc, err);
        }

However, this is an indiscriminate change which will affect all socket code,
including the networking code for distributed Erlang, and any third-party
Erlang code you are using.
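
To make the difference concrete, here is a minimal stand-alone sketch of the
decision the driver code above makes, written in plain C rather than with the
driver's own macros. The names recv_outcome, classify_recv and the mask_reset
flag are made up for illustration and do not exist in inet_drv.c, and using
EAGAIN/EWOULDBLOCK in place of ERRNO_BLOCK is an assumption about Unix builds.

```c
#include <errno.h>

/* What the caller of gen_tcp:recv/2 would eventually be told.
 * Illustrative names only, not identifiers from inet_drv.c. */
typedef enum {
    RECV_DATA,        /* n > 0: bytes were read */
    RECV_WOULD_BLOCK, /* nothing to read yet, try again later */
    RECV_CLOSED,      /* surfaces as {error, closed} */
    RECV_ERROR        /* surfaces as {error, Posix}, e.g. econnreset */
} recv_outcome;

/* n is the return value of the recv syscall; err is errno when n < 0.
 * mask_reset != 0 models the stock behaviour, 0 models the quick fix. */
recv_outcome classify_recv(long n, int err, int mask_reset)
{
    if (n < 0) {
        if (err == ECONNRESET)
            return mask_reset ? RECV_CLOSED : RECV_ERROR;
        if (err == EAGAIN || err == EWOULDBLOCK) /* assumed ERRNO_BLOCK */
            return RECV_WOULD_BLOCK;
        return RECV_ERROR;
    }
    if (n == 0)
        return RECV_CLOSED;   /* orderly EOF from the peer */
    return RECV_DATA;
}
```

With mask_reset set, a reset and an orderly EOF give the same outcome, which
is why the two cannot be told apart from Erlang. With it cleared, only the
n == 0 case is reported as closed.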

Also, be warned that this fix won't alert you to every incoming RST. For
example, if you have a large payload buffered in the socket driver in Erlang 
(i.e. a payload that is too large for the kernel socket send buffer) and you 
receive an RST, then your next call to gen_tcp:recv/2 will return 
{error, closed} rather than {error, econnreset}. See below for example code 
which shows this. The reason this happens is as follows:

 1. When there is outbound data buffered in the socket driver queue,
    gen_tcp:send/2 becomes an asynchronous call. That is, the new data
    is added to the driver queue and the call returns 'ok'. The actual
    send syscall takes place later, when epoll, select or whichever
    polling mechanism is in use reports the socket as writable.

 2. Then, when you are in passive mode and an error is detected on the send
    syscall, there is no caller to return it to, so it is marked as a
    "delayed close error". This has two consequences:

      (a) it is masked to a generic {error, closed}; and
      (b) it is returned on the next call to gen_tcp:recv/2 or gen_tcp:send/2.

So, the send error is ultimately returned on a gen_tcp:recv/2 call, and all
send errors are masked as {error, closed}.
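
The sequence in points 1 and 2 can be sketched as a toy model in C. This is
only an illustration of the behaviour described above under simplifying
assumptions, not the data structures or control flow of inet_drv.c; toy_sock,
toy_send, toy_flush and toy_recv are hypothetical names, and the model keeps
the socket in the closed state once an error has been recorded.

```c
#include <errno.h>

/* Toy passive-mode socket. Hypothetical; not taken from inet_drv.c. */
typedef struct {
    int queued_bytes;   /* outbound data still waiting in the driver queue */
    int delayed_close;  /* set when a deferred send syscall has failed */
} toy_sock;

typedef enum { TOY_OK, TOY_CLOSED } toy_result;

/* Models gen_tcp:send/2: enqueue the data and report success, unless an
 * earlier deferred send already failed. */
toy_result toy_send(toy_sock *s, int len)
{
    if (s->delayed_close)
        return TOY_CLOSED;
    s->queued_bytes += len;
    return TOY_OK;
}

/* Models the later send syscall, run when no Erlang caller is waiting.
 * send_errno is the errno from that syscall, or 0 on success. */
void toy_flush(toy_sock *s, int send_errno)
{
    s->queued_bytes = 0;
    if (send_errno != 0) {
        /* The specific errno (ECONNRESET, EPIPE, ...) is dropped here;
         * only the fact that the connection is gone is remembered. */
        s->delayed_close = 1;
    }
}

/* Models gen_tcp:recv/2: a pending delayed close is reported first. */
toy_result toy_recv(toy_sock *s)
{
    return s->delayed_close ? TOY_CLOSED : TOY_OK;
}
```

Calling toy_send, then toy_flush with ECONNRESET, then toy_recv yields
TOY_CLOSED: the reset was seen on the send path, but by the time it reaches a
caller nothing but "closed" is left of it.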

In active mode the problems are similar.

I've got a patch for the recv errors and I'm working on a patch for the send 
errors. Both patches will allow the user to set a socket option that shows all 
econnreset errors (changing some epipe errors to econnreset in the case of send
errors). With any luck, I'll release these patches over the next week or
two as part of a larger set of patches. No guarantee they'll be accepted 
though.

Rory

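%%--- Server Code ---
%% Accepts one connection, sends a small payload, never reads, and then
%% closes with a zero linger timeout, which is intended to make the
%% client see an RST.
-module(server).
-export([run/0]).

-define(PORT, 7777).
-define(RECBUF, (4 * 1024)).

run() ->
    SockOpts = [{recbuf, ?RECBUF}, {reuseaddr, true}, {active, false}],
    {ok, LSock} = gen_tcp:listen(?PORT, SockOpts),
    {ok, Sock} = gen_tcp:accept(LSock),
    ok = gen_tcp:close(LSock),
    %% Zero linger timeout: close aborts the connection instead of
    %% performing a normal FIN shutdown.
    ok = inet:setopts(Sock, [{linger, {true, 0}}]),
    ok = gen_tcp:send(Sock, "Any payload"),
    %% Give the client time to queue its payload before the reset.
    timer:sleep(1000),
    ok = gen_tcp:close(Sock).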

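%%--- Client Code ---
%% Sends more data than the small send buffer is meant to hold, so that
%% part of it stays in the driver queue, then calls recv after the
%% server has reset the connection.
-module(client).
-export([run/0]).

-define(PORT, 7777).
-define(SNDBUF, (4 * 1024)).
-define(PAYLOAD_SIZE, (30 * 1024)).

run() ->
    SockOpts = [{sndbuf, ?SNDBUF}, {active, false}],
    {ok, Sock} = gen_tcp:connect("localhost", ?PORT, SockOpts),
    Payload = lists:duplicate(?PAYLOAD_SIZE, $.),
    %% Returns ok although part of the payload may only be queued.
    ok = gen_tcp:send(Sock, Payload),
    %% Wait until after the server has closed its socket.
    ok = timer:sleep(2000),
    %% This is the recv call whose result is discussed above.
    Res = gen_tcp:recv(Sock, 0),
    io:format("Result: ~p~n", [Res]),
    gen_tcp:close(Sock).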


