[erlang-bugs] Erlang nodes fail to communicate if user has no capability to change SO_PRIORITY socket options.

Serge Aleynikov serge@REDACTED
Tue Jan 18 05:41:57 CET 2011


At some point I had a similar issue with the SO_PRIORITY and SO_TOS 
handling in inet_drv: I opened a unix domain socket and passed the open 
file descriptor to gen_tcp, which then failed to function properly. 
After I discussed the issue with Per Hedeland, he sent me the attached 
patch, which solved it.  Perhaps it will also work in your case, and if 
so, it should be included in the distribution.
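
For anyone following along, the failure mode discussed below can be 
sketched in plain C roughly as follows.  This is only an illustration 
under my own assumptions, not the inet_drv code and not the attached 
patch: the helper name copy_priority_tolerant is made up, and it uses 
errno directly where the driver uses sock_errno().  It reads 
SO_PRIORITY from one socket, writes it to another, and treats EPERM 
from the write as non-fatal.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of inet_drv.c): copy SO_PRIORITY from
 * socket `from` to socket `to`.  Returns 0 on success or when the set
 * is refused with EPERM, -1 on any other error. */
int copy_priority_tolerant(int from, int to)
{
#ifdef SO_PRIORITY
    int prio = 0;
    socklen_t len = sizeof(prio);

    assert(from >= 0 && to >= 0);

    if (getsockopt(from, SOL_SOCKET, SO_PRIORITY, &prio, &len) != 0)
        return -1;

    if (setsockopt(to, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio)) != 0) {
        /* A process without CAP_NET_ADMIN may not be allowed to set
         * this value; keep whatever priority the socket already has. */
        if (errno != EPERM)
            return -1;
    }
#else
    /* SO_PRIORITY is not available on this platform: nothing to copy. */
    (void)from;
    (void)to;
#endif
    return 0;
}
```

In the accept path this would be called with the listen socket as 
`from` and the newly accepted socket as `to`, so a refused priority 
change no longer causes the connection to be closed.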

Serge

On 1/17/2011 1:33 PM, Janek Wrobel wrote:
> On Thu, Jan 13, 2011 at 5:22 PM, Patrik Nyblom <pan@REDACTED> wrote:
>> Hi!
>>
>> On Thu, 13 Jan 2011, Janek Wrobel wrote:
>>
>>> Hi,
>>>
>>> When trying to set up an Erlang cluster I was getting the following
>>> error while spawning a function on a remote node:
>>>
>>> =ERROR REPORT==== 13-Jan-2011::01:00:38 ===
>>> ** Can not start hello_world:ping,[] on 'node3@REDACTED' **
>>>
>>> After some investigation it turned out that the nodes did not accept
>>> TCP connections, because setting the SO_PRIORITY socket option
>>> failed. The strace output follows:
>>
>> What kind of node (kernel version, extra options etc) do you have?
>
> Sorry for the late reply, but I was trying to investigate which
> configuration option is responsible for this behavior and how to
> reproduce it on a standard Linux box. Unfortunately without any
> success. The problem here is that the socket gets created with a
> default priority larger than the user running the process can
> set with setsockopt, so the sequence of getsockopt(SO_PRIORITY,
> &priority), setsockopt(SO_PRIORITY, priority) fails.
>
> I was thinking that maybe some firewall rule increasing the TOS of
> packets directed to and coming from a given TCP port, or some traffic
> shaping rule ('tc' command), could have the effect of changing the
> default priority of sockets associated with the port. It does not seem
> to be the case. Maybe someone on this list knows whether the default
> priority of Linux sockets can somehow be altered?
>
> One scenario in which the sequence of getsockopt(), setsockopt() can
> fail is when the socket was created by a different OS process that had
> the CAP_NET_ADMIN capability. The socket descriptor can then be passed
> to an Erlang VM running without CAP_NET_ADMIN and used in the
> 'listen {fd, Fd}' function, causing a similar error. But this is
> definitely not the case here.
>
>
>> The inet_driver has a workaround for SO_PRIORITY being destroyed by SO_TOS
>> settings, I think that's where this fails.
>>
>>>
>>> accept(7, {sa_family=AF_INET, sin_port=htons(51602),
>>> sin_addr=inet_addr("123.123.123.123")}, [16]) = 10
>>> fcntl(10, F_GETFL)                      = 0x2 (flags O_RDWR)
>>> fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
>>> getsockopt(7, SOL_TCP, TCP_NODELAY, [29220483580821504], [4]) = 0
>>> getsockopt(7, SOL_SOCKET, SO_KEEPALIVE, [29220483580821504], [4]) = 0
>>> getsockopt(7, SOL_SOCKET, SO_PRIORITY, [29220483580821504], [4]) = 0
>>> getsockopt(7, SOL_IP, IP_TOS, [29220483580821504], [4]) = 0
>>> getsockopt(10, SOL_SOCKET, SO_PRIORITY, [-4294967281], [4]) = 0
>>> getsockopt(10, SOL_IP, IP_TOS, [64424509440], [4]) = 0
>>> setsockopt(10, SOL_IP, IP_TOS, [0], 4)  = 0
>>> setsockopt(10, SOL_SOCKET, SO_PRIORITY, [15], 4) = -1 EPERM (Operation
>>> not permitted)
>>> close(10)
>>>
>>> To make it work I needed to add '#undef SO_PRIORITY' to
>>> erts/emulator/drivers/common/inet_drv.c and recompile.
>>>
>>> Can errors from setsockopt(..., SO_PRIORITY) be ignored? According to
>>> the socket(7) manual, it is normal for a user not to be able to change
>>> this option ('Setting a priority outside the range 0 to 6 requires the
>>> CAP_NET_ADMIN capability.').
>>
>>
>> I think it would be OK if you checked that you got EPERM in the exact
>> copy-from-listen-socket-to-result-of-accept code and ignored the result
>> then.
>>
>> I suspect you would have to patch the function setopt_prio_tos_trick in
>> inet_drv.c like this:
>> -----------------------------
>> diff --combined erts/emulator/drivers/common/inet_drv.c
>> index 818bc63,818bc63..0000000
>> --- a/erts/emulator/drivers/common/inet_drv.c
>> +++ b/erts/emulator/drivers/common/inet_drv.c
>> @@@ -5095,6 -5095,6 +5095,9 @@@ static int setopt_prio_tos_tric
>>                                            SO_PRIORITY,
>>                                            (char *)&tmp_ival_prio,
>>                                            tmp_arg_sz_prio);
>> ++                      if (res != 0 && sock_errno() == EPERM) {
>> ++                          res = 0;
>> ++                      }
>>                     }
>>                 }
>>             }
>> -------------------------------
>>
>> Try that and see if it fixes the problem.
>
> This fixes the problem.
>
> thanks,
> Janek
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 2-otp.inet_drv.R14A.patch
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20110117/610ab895/attachment.ksh>

