[erlang-questions] how to flush a gen_tcp socket on close?

Per Hedeland
Sat Apr 7 14:48:35 CEST 2012

Sheesh, once again I find that if you want something done properly, you
have to do it yourself. :-) Here I have three people, including the
author of inet_drv, and a demo program (and OK, some manual
send()/recv()/close()-calling of mine in the shell seemed to confirm it)
telling me that gen_tcp may lose data on close() - and it's simply not

Short summary: It seems that the loop-wait in prim_inet is just smoke
and mirrors, and should be removed - because inet_drv will *not* discard
the user-level queue and close the OS socket when the port is closed. It
faithfully keeps on sending from the queue as long as the OS socket is

And as long as the node isn't stopped, of course - maybe that was the
OP's problem? In which case I think the question should have been "how
to know when all buffered data is sent and acked by the receiver?".
Some more comments below for those that aren't fed up with the subject

Tony Rogvall wrote:
>Remember that inet_drv is not only used by gen_tcp but is also used by distribution.
>The internal port queues are a great place to push stuff when you should be blocking,
>but are not really ready to do that yet :-)

But I still find the way they currently work to be strange /
unexpected. If the socket buffer is "almost full" and we do two send()
calls of ~5 kB each, the second one will block and mark the port busy
(due to the default 8 kB high-water mark) - while a single send() of
10 *MB* will *not* block. My patch, which I still think is good, changes
this - send() will block and the port will be marked busy as soon as the
high-water mark is reached, regardless of the state of the queue before
the send().
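
To make the difference concrete, here is a toy model of the two blocking
rules as they are described in the paragraph above. It is not inet_drv
code: the module and function names are made up for illustration, and it
assumes the kernel socket buffer accepts nothing, so the whole payload of
each send() ends up in the user-level queue.

```erlang
-module(busy_rule_sketch).
-export([blocks_stock/3, blocks_patched/3, demo/0]).

%% Default high-water mark mentioned in the thread (8 kB).
-define(HIGH, 8192).

%% Rule as described for the current driver: the caller only blocks if
%% the queue was already non-empty before the call and the queue then
%% reaches the high-water mark.
blocks_stock(QueuedBefore, Unsent, High) ->
    QueuedBefore > 0 andalso QueuedBefore + Unsent >= High.

%% Rule as described for the patch: the caller blocks as soon as the
%% queue reaches the high-water mark, whatever its state before the call.
blocks_patched(QueuedBefore, Unsent, High) ->
    QueuedBefore + Unsent >= High.

%% The two cases from the text: the second of two ~5 kB sends, and a
%% single 10 MB send into an empty queue.
%% Each tuple is {Case, BlocksInStock, BlocksWithPatch}.
demo() ->
    TenMB = 10 * 1024 * 1024,
    [{second_of_two_5k_sends,
      blocks_stock(5000, 5000, ?HIGH),
      blocks_patched(5000, 5000, ?HIGH)},
     {single_10mb_send,
      blocks_stock(0, TenMB, ?HIGH),
      blocks_patched(0, TenMB, ?HIGH)}].
```

In this model the second 5 kB send blocks under both rules, while the
10 MB send blocks only under the patched rule.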

>To push wouldblock back to Erlang could be a way of handling the problem
>and let Erlang get the POLLOUT signal etc. But WE did not design it that way :-)

Ouch no, we certainly don't want that!

>Plenty of history around the inet_drv, not saying it is doing the correct thing or even close
>but it is an explanation.

I now - just as before this thread - think it is doing very close to the
right thing. But I'm not sure if anyone really understands all the
details of *how* it does it anymore...

>If data is queued in user land and someone stops the node (init:stop or ^C ...),
>this data will be discarded, while data that made it into the kernel will not.
>I do not see how this can be fixed (except using queues in user land)

Hm, did you leave out a "not", i.e. "except not using queues in user
land"? Otherwise I can't quite follow you. Anyway, at least having the
choice to not use queues in user land would be good, I think - and with
my patch you have it, by setting the high/low-water marks to 1/0.
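
A minimal sketch of that watermark setting, using the documented
high_watermark / low_watermark socket options. The module name and the
loopback demo are assumptions made for illustration. Note that on an
unpatched driver these options alone do not make send() block when the
queue is empty before the call; that is what the patch changes.

```erlang
-module(watermark_sketch).
-export([connect_unbuffered/2, demo/0]).

%% Connect with the user-level queue watermarks set as low as they go.
%% Only with the patch from this thread does this effectively disable
%% the queue.
connect_unbuffered(Host, Port) ->
    Opts = [binary, {active, false},
            {high_watermark, 1}, {low_watermark, 0}],
    gen_tcp:connect(Host, Port, Opts).

%% Loopback demo: open a connection and read the options back.
demo() ->
    {ok, L} = gen_tcp:listen(0, [binary, {active, false}]),
    {ok, Port} = inet:port(L),
    {ok, S} = connect_unbuffered({127,0,0,1}, Port),
    {ok, A} = gen_tcp:accept(L),
    {ok, Vals} = inet:getopts(S, [high_watermark, low_watermark]),
    ok = gen_tcp:close(S),
    ok = gen_tcp:close(A),
    ok = gen_tcp:close(L),
    Vals.
```

The same options can also be applied to an existing socket with
inet:setopts/2.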

>On 6 apr 2012, at 12:39, Per Hedeland wrote:
>> But it's reasonable to expect that if the networks, hosts, and
>> applications keep running, and the user doesn't close the tab in his
>> browser, that jpeg *should* eventually be displayed in full even if the
>> user is on a slow dialup and the sender has long since completed his
>> close() call and gone on to other business (or even called it a day).
>> gen_tcp:close/1 doesn't meet this expectation.

Not by me anymore. :-) Well, except for "called it a day" I guess.

>> I will maintain that the only reason to use shutdown is when you want to
>> do a "half close", or even only when you want to do a "half close in the
>> send direction". What's the point otherwise?
>Of course, I do not see your point ?
>I will maintain that if you want to be sure that the other side actually got all your data
>then use shutdown followed by "wait for close". 

I *think* that you are talking about doing shutdown(write) + an
expectation that the receiver will close when he gets tcp_closed/EOF +
waiting for the tcp_closed resulting from that. That's a nice way to
make sure that you aren't discarding anything due to stopping the node
prematurely, but I'm not sure it guarantees that the other side got
everything (e.g. is it possible to distinguish the case that he crashed
before that?).
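
The shutdown(write) + "wait for close" sequence could look roughly like
the sketch below. It assumes a passive-mode socket and a peer that closes
its side once it has seen EOF; the module and function names are invented
for illustration. As noted above, seeing the close does not by itself
prove that the peer processed everything, since a crashed peer looks much
the same.

```erlang
-module(close_sketch).
-export([send_and_wait/3, demo/0]).

%% Send Data, half-close the write side, then wait for the peer to close.
%% Assumes Socket was opened with {active, false}.
send_and_wait(Socket, Data, Timeout) ->
    ok = gen_tcp:send(Socket, Data),
    ok = gen_tcp:shutdown(Socket, write),
    Result = wait_closed(Socket, Timeout),
    ok = gen_tcp:close(Socket),
    Result.

%% Timeout applies to each recv call, not to the whole wait.
wait_closed(Socket, Timeout) ->
    case gen_tcp:recv(Socket, 0, Timeout) of
        {ok, _Ignored}  -> wait_closed(Socket, Timeout); % drain late data
        {error, closed} -> ok;                           % peer closed
        {error, Reason} -> {error, Reason}               % e.g. timeout
    end.

%% Loopback demo: the peer counts bytes until EOF, reports, then closes.
demo() ->
    {ok, L} = gen_tcp:listen(0, [binary, {active, false}]),
    {ok, Port} = inet:port(L),
    Parent = self(),
    spawn_link(fun() ->
        {ok, A} = gen_tcp:accept(L),
        Parent ! {received, read_all(A, 0)},
        gen_tcp:close(A)
    end),
    {ok, S} = gen_tcp:connect({127,0,0,1}, Port,
                              [binary, {active, false}]),
    ok = send_and_wait(S, binary:copy(<<0>>, 1000000), 5000),
    Count = receive {received, N} -> N after 5000 -> timeout end,
    ok = gen_tcp:close(L),
    Count.

read_all(Socket, Acc) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Bin}       -> read_all(Socket, Acc + byte_size(Bin));
        {error, closed} -> Acc
    end.
```

The sender only moves on once the peer's close has been observed, so it
will not stop the node while data is still sitting in the user-level
queue.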

But anyway my point was that it is completely point*less* to do a
shutdown(read_write) before the close(), or even to *ever* do a
shutdown(read_write) - and you didn't say otherwise, I just read
something that wasn't there.

>> IMO the *fix* is to not have a user-level queue at all, and thus always
>> have gen_tcp:send/2 block until everything has been written to the
>> socket. I don't expect this fix to happen though, but it should at least
>> be possible to disable the user-level queue, by passing options
>> {high_watermark, 1}, {low_watermark, 0}. However this doesn't work the
>> way the code is currently written - send() will never block if the queue
>> is empty before the call. The patch below fixes this, and with it I can
>> run Matt's test successfully (after fixing some "bad" assumptions) if I
>> use the watermark-setting.
>Cool. Let's see if this "remove inet_drv queue" is reasonable. Did you
>check the effects on erlang distribution?

No - but the patch does *not* "remove inet_drv queue", only makes it
possible to disable it if you really want to. And it changes the
blocking behavior of send() per above. I don't *think* that the patch
per se should have any significant effect on the distribution - having
the distribution disable the queue may well have some bad effect, but I
don't see any reason to do that.

>> Both of these suffer from the effect of "silly window avoidance" - i.e.
>> even if they improve the coupling "more data is sent" => "send queue
>> shrinks", they do not help with the coupling "more data is read" =>
>> "more data is sent". And they don't at all address the "keep trying as
>> long as the receiver is alive" goal.
>Is there any way to mimic the kernel in this sense?

Not that I know of.

