[erlang-questions] Incoming TCP connections are closed immediately, requiring SO_PRIORITY on newer Linux kernel

Dmitry Simonov dimmoborgir@REDACTED
Wed Jul 18 06:56:30 CEST 2018


Hello!

I've run into strange behaviour of the RabbitMQ application on Linux kernel
4.14.46. The rabbit-users mailing list recommended asking here, since it is
probably Erlang's behaviour.

Symptoms:
Incoming TCP connections are established and then immediately closed. strace
of the /usr/lib/erlang/erts-10.0/bin/beam.smp process shows setsockopt()
failing with EPERM when setting the SO_PRIORITY socket option:

[pid  1814] accept(58, {sa_family=AF_INET, sin_port=htons(43054),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 13
[pid  1814] epoll_ctl(4, EPOLL_CTL_MOD, 58, {EPOLLONESHOT, {u32=58,
u64=20331670804627514}}) = 0
[pid  1814] fcntl(13, F_GETFL)          = 0x2 (flags O_RDWR)
[pid  1814] fcntl(13, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  1814] getsockopt(58, SOL_TCP, TCP_NODELAY, [0], [4]) = 0
[pid  1814] getsockopt(58, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
[pid  1814] getsockopt(58, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1814] getsockopt(58, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1814] getsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1814] getsockopt(13, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1814] setsockopt(13, SOL_IP, IP_TOS, [0], 4) = 0
[pid  1814] setsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], 4) = -1 EPERM
(Operation not permitted)
[pid  1814] getsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1814] getsockopt(13, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1814] setsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], 4) = -1 EPERM
(Operation not permitted)
[pid  1814] getsockopt(13, SOL_SOCKET, SO_LINGER, {onoff=0, linger=0}, [8])
= 0
[pid  1814] close(13)                   = 0
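
To check the failing call outside of the Erlang VM, a minimal Python sketch like the one below can be run as the same user that runs beam.smp. It is only an approximation of the trace above: it sets SO_PRIORITY on a freshly created TCP socket rather than on an accepted one, and the function name try_set_priority is made up for illustration.

```python
import errno
import socket

# SO_PRIORITY is Linux-specific; fall back to its Linux value (12) if this
# Python build does not expose the constant.
SO_PRIORITY = getattr(socket, "SO_PRIORITY", 12)


def try_set_priority(priority=0):
    """Try setsockopt(SOL_SOCKET, SO_PRIORITY, priority) on a new TCP socket.

    Returns "ok" on success, or the symbolic errno name (for example
    "EPERM") if creating the socket or setting the option fails.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.setsockopt(socket.SOL_SOCKET, SO_PRIORITY, priority)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


if __name__ == "__main__":
    print("SO_PRIORITY=0:", try_set_priority(0))
```

If this prints EPERM for priority 0 on the 4.14.46 host and ok on the 4.9.86 host, the difference can be reproduced without RabbitMQ or Erlang involved.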

With an older Linux kernel (4.9.86), connections work fine:

[pid  1665] accept(58, {sa_family=AF_INET, sin_port=htons(48170),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 12
[pid  1665] epoll_ctl(4, EPOLL_CTL_MOD, 58, {EPOLLONESHOT, {u32=58,
u64=20331670804627514}}) = 0
[pid  1665] fcntl(12, F_GETFL)          = 0x2 (flags O_RDWR)
[pid  1665] fcntl(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  1665] getsockopt(58, SOL_TCP, TCP_NODELAY, [0], [4]) = 0
[pid  1665] getsockopt(58, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
[pid  1665] getsockopt(58, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(58, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_IP, IP_TOS, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_IP, IP_TOS, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_TCP, TCP_NODELAY, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(58, SOL_IPV6, IPV6_TCLASS, 0x7f88c14fc7c8,
0x7f88c14fc7cc) = -1 EOPNOTSUPP (Operation not supported)
[pid  1665] accept(58, 0x7f88c14feaf0, 0x7f88c14feac4) = -1 EAGAIN
(Resource temporarily unavailable)
[pid  1665] epoll_ctl(4, EPOLL_CTL_MOD, 58, {EPOLLIN|EPOLLONESHOT,
{u32=58, u64=14125640596043333690}}) = 0
[pid  1665] recvfrom(12, 0x7f88c43489e8, 1460, 0, NULL, NULL) = -1
EAGAIN (Resource temporarily unavailable)
[pid  1665] epoll_ctl(4, EPOLL_CTL_ADD, 12, {EPOLLIN|EPOLLONESHOT,
{u32=12, u64=14125640596043333644}}) = 0
[pid  1665] futex(0x7f88c3f811d0, FUTEX_WAIT_PRIVATE, 4294967295, {0,
151060429}) = -1 ETIMEDOUT (Connection timed out)
[pid  1665] futex(0x7f88c3f811d0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1665] futex(0x7f88c3f813d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>


Granting the CAP_NET_ADMIN capability explicitly (setcap cap_net_admin+ep
/usr/lib/erlang/erts-10.0/bin/beam.smp) makes RabbitMQ work again (TCP
connections are no longer closed).
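
To confirm whether a running process actually holds that capability, the effective capability mask can be read from /proc. The sketch below is an illustration under assumptions: has_cap_net_admin is a hypothetical helper name, and it relies on the CapEff line of /proc/<pid>/status and on CAP_NET_ADMIN being bit 12, as defined in linux/capability.h.

```python
import sys

CAP_NET_ADMIN = 12  # bit index of CAP_NET_ADMIN in linux/capability.h


def has_cap_net_admin(status_text):
    """Return True if the CapEff mask in /proc/<pid>/status text has CAP_NET_ADMIN."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The mask is printed as a hexadecimal number without a 0x prefix.
            mask = int(line.split()[1], 16)
            return bool(mask & (1 << CAP_NET_ADMIN))
    return False


if __name__ == "__main__":
    # Pass the PID of beam.smp as the first argument; defaults to this process.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open(f"/proc/{pid}/status") as status_file:
        print("CAP_NET_ADMIN effective:", has_cap_net_admin(status_file.read()))
```

Running it with the PID of beam.smp before and after the setcap change (and a restart of the node) shows whether the capability reached the process.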

Could you please help?
Why does this problem occur?

Erlang version is 21 (latest):
# erl -sname test
Erlang/OTP 21 [erts-10.0] [source] [64-bit] [smp:2:2] [ds:2:2:10]
[async-threads:1]

Eshell V10.0  (abort with ^G)

RabbitMQ version: 3.7.7-1 (latest).

-- 
Best Regards,
Dmitry Simonov