[erlang-patches] [PATCH] Fix SCTP multihoming for IPv6

Raimo Niskanen raimo+erlang-patches@REDACTED
Mon May 7 15:17:49 CEST 2012


On Sat, May 05, 2012 at 08:02:52PM +0200, Tomas Abrahamsson wrote:
> On Tue, Apr 24, 2012 at 16:11, Raimo Niskanen
> <raimo+erlang-patches@REDACTED> wrote:
> > On Mon, Apr 23, 2012 at 11:21:08PM +0200, Tomas Abrahamsson wrote:
> >> On Mon, Apr 23, 2012 at 16:16 +02:00, Raimo Niskanen wrote:
> >> > Is it so that the sctp_bindx argument 2 declared as (struct sockaddr *)
> >> > is in fact misused to be able to contain IPv6 addresses by packing
> >> > them unaligned?
> >> >
> >> > If so, I would like this fact more visible in the code by defining 'addrs' as:
> >> >        char addrs[256 * sizeof(struct inet_address)]
> >
> > But before that the Linux man page says:
> >        If sd is an IPv4 socket, the addresses passed must be IPv4 addresses.
> >        If sd is an IPv6 socket, the addresses passed can be either IPv4 or
> >        IPv6 addresses.
> >
> > So it might be possible to mix them... And if that is allowed the only way
> > I can see is packing them in a char array.
> >
> > If it is not allowed to pack them in a char array, the sctp_bindx call
> > should be able to detect mixed addresses immediately on the first
> > differing address.
> 
> I've done some experimenting.  It seems I was a bit too
> early in claiming that multihoming works with the patch.
> Should have tested other OSes.  There are several issues.

You got me worried, I thought I scared you off.
Very good findings from you below...

> 
> I have tested Linux, Solaris 11 (OpenIndiana 151a3)
> and to some extent also FreeBSD 8.2.

That should cover what is important.

> 
> On Solaris and FreeBSD, one must call bind before calling
> sctp_bindx.  On Linux it is required according to the man
> page, but in practice, it seems to work also with no call to
> bind at all.  This means ipv4 multihoming currently does not
> work on Solaris and FreeBSD, with or without any patch.
> I've verified that on OpenIndiana and FreeBSD 8.2.
> 
> Additionally, FreeBSD accepts only one address at a time in
> the sctp_bindx call.  Quoting the man page: "The argument
> addrs is a list of addresses (where the list may be only 1
> in length)".
> 
> On Solaris, mixing ipv4 addresses and ipv6 addresses in the
> call to sctp_bindx is not allowed according to the man page.
> Specifying ipv4-mapped ipv6 addresses is allowed, though.
> But it seems to half-work in practice: sctp_bindx does not
> return an error when adding an ipv4 address to an ipv6 sctp
> socket, but on the other hand, it does not seem to work to
> connect to it over ipv4 and send sctp messages.  The client
> thinks it successfully connects, but the server does not see
> anything.  Same symptoms both when specifying an ipv4
> address and an ipv4-mapped ipv6 address to sctp_bindx:
> can't get ipv4 sctp communication to work.

I would expect such problems on FreeBSD since you have to
use an rc.conf setting: ipv6_ipv4mapping="YES" that sets
a sysctl flag net.inet6.ip6.v6only: 0
or the OS will not allow IPv6 sockets to handle IPv4 traffic.
They claim there are too many security problems in that.

So it would not surprise me if Solaris does something similar.

> 
> On Linux, mixing ipv4 and ipv6 addresses in the call to
> sctp_bindx seems to work well.  It is possible
> to connect and send messages also over ipv4 to an ipv6
> socket with ipv4 addresses added to it, and even to connect
> over ipv4 to an ipv6 socket with ipv4-mapped ipv6 addresses
> added to it.  (Tested internally within one host, though, not
> between hosts.)
> 
> So to make multihoming work also on Solaris and FreeBSD, we
> would need to make it call bind before calling
> sctp_bindx, and call sctp_bindx repeatedly with only one
> address at a time.

Very good. Then it will work as well as the OS allows.
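
To make the "one address at a time" idea concrete, here is a minimal
sketch of walking a char buffer that holds sockaddr_in and sockaddr_in6
entries packed back to back, as discussed earlier in the thread. The
helper names packed_entry_len and split_packed are hypothetical and are
not taken from inet_drv.c. The sketch only finds the entry boundaries;
the actual bind and sctp_bindx calls are indicated in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Length of the packed entry starting at p, chosen from its address
 * family. Returns 0 if the family is unknown or the entry is truncated.
 * The family is copied out with memcpy because packed entries may be
 * unaligned. */
static size_t packed_entry_len(const unsigned char *p, size_t remaining)
{
    sa_family_t family;
    size_t len;

    if (remaining < offsetof(struct sockaddr, sa_family) + sizeof family)
        return 0;
    memcpy(&family, p + offsetof(struct sockaddr, sa_family), sizeof family);

    if (family == AF_INET)
        len = sizeof(struct sockaddr_in);
    else if (family == AF_INET6)
        len = sizeof(struct sockaddr_in6);
    else
        return 0;
    return len <= remaining ? len : 0;
}

/* Record the start offset of every packed entry in buf.
 * Returns the number of entries, or -1 if the buffer is malformed or
 * holds more than max_entries entries.
 *
 * A caller could then copy entry 0 into a struct sockaddr_storage and
 * pass it to bind(), and pass each later entry on its own to
 * sctp_bindx(sd, sa, 1, SCTP_BINDX_ADD_ADDR). */
static int split_packed(const unsigned char *buf, size_t buflen,
                        size_t *offsets, int max_entries)
{
    int n = 0;
    size_t off = 0;

    while (off < buflen) {
        size_t len = packed_entry_len(buf + off, buflen - off);
        if (len == 0 || n == max_entries)
            return -1;
        offsets[n++] = off;
        off += len;
    }
    return n;
}
```

Splitting first and binding afterwards keeps the OS-specific part (bind
before sctp_bindx, one address per call) separate from the parsing of
the packed buffer.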

> 
> About calling bind before sctp_bindx, I guess the proper
> place to change would be to make gen_sctp:open call bind on
> the first ip address, and sctp_bindx any following ip
> addresses.  Currently, it calls sctp_bindx (only), if more
> than one ip address is specified, otherwise it calls bind.
> The decision to call bind or sctp_bindx happens in
> inet:open/8.
> 
>    ~ ~ ~
> 
> If we would like to go the route of allowing mixing ipv4 and
> ipv6 addresses in the sctp_bindx call, then we would also
> need to change the inet_drv.c and prim_inet.erl, so that the
> SCTP_REQ_BINDX and INET_REQ_BIND messages also carry some
> kind of indication saying if the address(es) is/are ipv4 or
> ipv6 address(es).  Currently, the prim_inet.erl sends 2
> bytes port number followed by 4 or 16 bytes of address, and
> inet_drv.c expects 4 or 16 byte addresses, depending on the
> socket's/port's family (ipv4 or ipv6).

There is a function inet_set_faddress in inet_drv.c for this.

In prim_inet you might reuse the function type_value(set, [addr], Addr),
which returns true or false depending on whether Addr is encodable by
enc_value(set, [addr], Addr).
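
As an illustration of what "some kind of indication" of the family could
look like on the wire, here is a minimal sketch of decoding one entry
that carries a leading family tag, followed by the 2-byte port and the
4 or 16 address bytes. The tag values, the struct and the function name
decode_tagged are assumptions made for this example; I have not checked
them against what inet_set_faddress actually expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag values, chosen only for this sketch. */
enum { TAG_V4 = 1, TAG_V6 = 2 };

struct tagged_addr {
    int tag;                 /* TAG_V4 or TAG_V6 */
    uint16_t port;           /* host byte order */
    unsigned char addr[16];  /* first addr_len bytes are valid */
    size_t addr_len;         /* 4 or 16 */
};

/* Decode one entry laid out as:
 *   1 byte tag, 2 bytes port (big endian), 4 or 16 bytes address.
 * Returns the number of bytes consumed, or 0 if the tag is unknown or
 * the buffer is too short for the entry the tag announces. */
static size_t decode_tagged(const unsigned char *p, size_t remaining,
                            struct tagged_addr *out)
{
    size_t alen;

    if (remaining < 3)
        return 0;
    if (p[0] == TAG_V4)
        alen = 4;
    else if (p[0] == TAG_V6)
        alen = 16;
    else
        return 0;
    if (remaining < 3 + alen)
        return 0;

    out->tag = p[0];
    out->port = (uint16_t)((p[1] << 8) | p[2]);
    memset(out->addr, 0, sizeof out->addr);
    memcpy(out->addr, p + 3, alen);
    out->addr_len = alen;
    return 3 + alen;
}
```

Because each entry announces its own length, a decoder can walk a list
of mixed ipv4 and ipv6 entries without relying on the socket's family.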

> 
> With the patch previously submitted, specifying for instance
> two ipv6 addresses and one ipv4 address to gen_sctp:open,
> will cause a badarg, because inet_set_address in inet_drv.c
> will not find enough bytes to unpack, thus returning einval.
> Specifying 3 ipv4 addresses (3 * (2+4) = 18 bytes) for an
> ipv6 sctp socket (expected number of bytes = 2+16) might not
> cause an error, but unexpected surprises instead.  Ouch :(
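
The byte counts above can be checked with a small sketch. It assumes
only the untagged layout described in this thread (2 bytes of port
followed by 4 or 16 bytes of address); the function name
untagged_entry_count is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { PORT_LEN = 2, V4_LEN = 4, V6_LEN = 16 };

/* Number of untagged entries a buffer of buflen bytes decodes to when
 * every entry is assumed to carry addr_len address bytes.
 * Returns -1 if the buffer is empty or is not a whole number of
 * entries, which corresponds to the "not enough bytes to unpack" case. */
static int untagged_entry_count(size_t buflen, size_t addr_len)
{
    size_t entry = PORT_LEN + addr_len;

    if (buflen == 0 || buflen % entry != 0)
        return -1;
    return (int)(buflen / entry);
}
```

With this, the 18 bytes of three ipv4 entries count as 3 entries when
read as ipv4 but as 1 well-formed entry when read as ipv6, while two
ipv6 entries plus one ipv4 entry (42 bytes) do not divide evenly into
ipv6 entries.
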
> 
>    ~ ~ ~
> 
> Regarding patching, I suppose it ought to be split into
> something like 3 commits: one fixing the bind + sctp_bindx
> issue, another adding support for mixing ipv4 and ipv6
> addresses and the last commit updating the preloaded
> prim_inet (never done that before, so please forgive any
> error in reasoning here).  The first commit would have
> value on its own, since it would add multihoming support for
> non-Linux OSes, while also adding ipv6 multihoming support
> for Linux.  Does this sound ok?

Yes.

I would say it is clearer to have the updated prim_inet.beam file
in the second commit too, and no third commit, so a git bisect
will not find any intermediate non-working commit.

> 
> 
> BRs
> Tomas
> _______________________________________________
> erlang-patches mailing list
> erlang-patches@REDACTED
> http://erlang.org/mailman/listinfo/erlang-patches

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


