[erlang-questions] TCP receive buffer: erlang size VS kernel size

Vincent de Phily vincent.dephily@REDACTED
Mon Aug 29 12:32:39 CEST 2011


On Thursday 25 August 2011 15:08:01 Vincent de Phily wrote:
> 1) The size seen in inet:getopts(S, [recbuf]) is the double of the size set
>    via gen_tcp:listen/2 (with some min/max values).
> 
> * I assume that the discrepancy is between an in-VM buffer and the in-kernel
> buffer ?
> * Why is that 2x ratio necessary ?
> * The docs never mention that there are two different buffers, and which
>   function reads/writes which buffer size. I only expected the kernel
> buffer, and I definitely expected 'recbuf' to represent the same object for
> inet:getopt as for gen_tcp:listen.

After reading `man tcp`, it appears that this 2x ratio is a kernel-side thing:
  "Note that TCP actually allocates twice the size of the buffer requested in
   the setsockopt(2) call, and so a succeeding  getsockopt(2)  call will not
   return the same size of buffer as requested in the setsockopt(2) call.  TCP
   uses the extra space for administrative purposes and internal kernel
   structures, and the /proc file values reflect the larger sizes compared to
   the actual TCP windows."

So no issue here: everything works "as expected" and there is no double
buffering in Erlang.
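
The kernel-side doubling can be checked outside of Erlang. Below is a minimal
Python sketch (not part of the original investigation) that sets SO_RCVBUF on
a fresh TCP socket and reads it back. The helper name `requested_vs_reported`
and the 8192-byte request are illustrative assumptions. On Linux the reported
value is normally twice the requested one, subject to the kernel's min/max
limits; other systems may report the requested value unchanged.

```python
import socket


def requested_vs_reported(requested):
    """Set SO_RCVBUF to `requested` bytes and return what getsockopt reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Ask the kernel for a receive buffer of `requested` bytes.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        # Read the value back; on Linux this includes the kernel's own overhead.
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


requested = 8192  # illustrative value, well below typical kernel maximums
reported = requested_vs_reported(requested)
print("requested:", requested, "reported:", reported)
```

This is the same effect seen when comparing the recbuf passed to
gen_tcp:listen/2 with the one returned by inet:getopts/2.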


> 2) If I don't specify a recbuf in gen_tcp:listen/2, the kernel buffer (as
> seen in inet:getopts) is 87 KB while the VM buffer (as seen in the size of
> received packets) is 1460 B.
> 
> * That's a much bigger difference ratio than previously seen.
> * The default value of 1460 bytes sounds too small for anything but the most
> ressource-constrained devices (my kernel agrees, but I admit having 4GB of
> memory).

I got the answer to that one by reading
erts/emulator/drivers/common/inet_drv.c: the default size of 1460 bytes for
the read(2) buffer is hardcoded, and it is used if no buffer has been
allocated yet and no read size has been specified by the caller.

I understand that the buffer needs to be preallocated and that Erlang is
sometimes used in memory-constrained environments, but as-is the behaviour is
surprising and underperforming. Instead of receiving "whatever is in the
socket right now", we get a fraction of it. Or, if we recently requested a
specific large amount, the next unspecified-size request can receive up to as
much data as the previous request did.
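
The "fraction of it" effect is a general property of reading a stream socket
with a small user-space buffer, so it can be illustrated without the Erlang
VM. The following is a rough Python sketch under stated assumptions: a
loopback TCP connection, a 10000-byte payload, and a hypothetical helper
`read_in_chunks`. It is not how inet_drv.c is implemented; it only mimics
reads capped at 1460 bytes while more data is already queued in the kernel.

```python
import socket

READ_BUF = 1460            # mirrors the hardcoded default discussed above
PAYLOAD = b"x" * 10000     # illustrative payload, larger than READ_BUF


def read_in_chunks(payload, bufsize):
    """Send `payload` over loopback TCP and return the size of each read."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server.listen(1)
        client = socket.create_connection(server.getsockname())
        conn, _ = server.accept()
        with client:
            client.sendall(payload)     # data is now queued in kernel buffers
        sizes = []
        with conn:
            while True:
                # Each read returns at most `bufsize` bytes, no matter how
                # much is waiting in the kernel receive buffer.
                chunk = conn.recv(bufsize)
                if not chunk:           # empty read: peer closed the socket
                    break
                sizes.append(len(chunk))
        return sizes


sizes = read_in_chunks(PAYLOAD, READ_BUF)
print(len(sizes), "reads, largest:", max(sizes))
```

With a 1460-byte buffer, the 10000 bytes need at least seven reads, whereas a
buffer sized like the kernel's could drain them in one.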

At the very least, this value of 1460 bytes should be documented in the
Erlang docs. Ideally, the read buffer size should be the same as the
kernel-side buffer size. Can anybody share an opinion on this?

In the meantime I learned my lesson: never call gen_tcp:listen/2 without
specifying a recbuf :)
-- 
Vincent de Phily


