[erlang-bugs] possibly incorrect search order in inet:gethostbyname_tm/4

Chaos Wang chaoslawful@REDACTED
Tue Jan 19 17:22:08 CET 2010


Cool~

Sorry for responding so late. I was digging into some glibc source code...

The following are my findings (IPv4 only, on Linux). I totally agree 
with you that the safest form to treat as an IPv4 address is the 
standard IPv4 dotted-decimal notation without a trailing dot.
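
A minimal sketch of that "safest form" check, for illustration only. The 
name is_strict_ipv4() is hypothetical (it is neither a libc nor an OTP 
function), and rejecting leading zeros such as "010" is an assumption made 
here to avoid any octal ambiguity:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true only for exactly four decimal fields in
 * 0..255, separated by single dots, with no trailing dot.
 * Assumption: leading zeros ("010") are rejected. */
bool is_strict_ipv4(const char *s)
{
    assert(s != NULL);
    int fields = 0;
    while (*s != '\0') {
        if (!isdigit((unsigned char)*s))
            return false;              /* a field must start with a digit */
        if (s[0] == '0' && isdigit((unsigned char)s[1]))
            return false;              /* leading zero */
        int value = 0;
        while (isdigit((unsigned char)*s)) {
            value = value * 10 + (*s - '0');
            if (value > 255)
                return false;          /* field out of range */
            s++;
        }
        fields++;
        if (*s == '.') {
            s++;
            if (*s == '\0')
                return false;          /* trailing dot */
        } else if (*s != '\0') {
            return false;              /* unexpected character */
        }
    }
    return fields == 4;
}
```

With this rule "127.0.0.1" is accepted, while "127.0.0.1.", "10.1", "17" 
and "0x7f.0.0.1" are all left to the normal host name lookup.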

The libc APIs I can think of that parse an IPv4 address string into 
in_addr form are:
    * inet_addr() (deprecated)
    * inet_aton()
    * inet_pton()
    * gethostbyname() and gethostbyname_r() (obsolete, but used by the
      inet_gethost program)
    * gethostbyname2() and gethostbyname2_r() (GNU extension)
    * getaddrinfo()

In all these functions, strings with a trailing dot are not considered 
to be IPv4 addresses.

inet_aton() (and the deprecated inet_addr()) recognizes IPv4 
numbers-and-dots notation: each number in the address can be decimal, 
octal or hexadecimal. The address can also be written in shorthand:

    a - means treat a as 32 bits
    a.b - means treat b as 24 bits
    a.b.c - means treat c as 16 bits

inet_pton() is like inet_aton(), but accepts no hexadecimal, no octal 
(with the exception of a plain 0) and no shorthand. So it only recognizes 
standard IPv4 dotted-decimal notation.

gethostbyname() (also gethostbyname2() and the *_r variants) uses 
__nss_hostname_digits_dots() to identify an IP address. This function calls 
inet_aton() to parse an IPv4 address, except that it refuses to accept any 
characters other than digits and dots. So the hexadecimal form of IPv4 
addresses can't be recognized by it.
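
A rough approximation of that behaviour, written from the description 
above and not copied from glibc: digits_dots_ipv4() is a hypothetical name, 
and the real __nss_hostname_digits_dots() does considerably more (buffer 
handling, IPv6, filling in a hostent).

```c
#define _DEFAULT_SOURCE            /* assumption: needed for inet_aton() */
#include <arpa/inet.h>
#include <assert.h>
#include <ctype.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical approximation: accept the name as an IPv4 address only if
 * it consists purely of digits and dots, does not end in a dot, and
 * inet_aton() can parse it. */
bool digits_dots_ipv4(const char *name, struct in_addr *out)
{
    assert(name != NULL && out != NULL);
    size_t len = strlen(name);
    if (len == 0 || name[len - 1] == '.')
        return false;                  /* empty, or trailing dot */
    for (const char *p = name; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p) && *p != '.')
            return false;              /* e.g. the 'x' in "0x7f" */
    }
    return inet_aton(name, out) != 0;  /* shorthand still allowed */
}
```

Under these assumptions "10.1" is still taken as 10.0.0.1, but 
"0x7f.0.0.1" and "127.0.0.1." fall through to a real host name lookup.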

getaddrinfo() uses inet_aton() to recognize an IPv4 address, so the two 
are equivalent in IPv4 address parsing.
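
One way to observe this without touching any resolver is to pass 
AI_NUMERICHOST, as in the sketch below. The function name 
numeric_getaddrinfo() is hypothetical, and the behaviour described is what 
I would expect from glibc on Linux; other libcs may differ.

```c
#define _DEFAULT_SOURCE            /* assumption: exposes getaddrinfo() etc. */
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Ask getaddrinfo() to treat s strictly as a numeric IPv4 address.
 * On success the address is stored in host byte order. */
bool numeric_getaddrinfo(const char *s, uint32_t *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    assert(s != NULL && out != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;   /* never perform a name lookup */

    if (getaddrinfo(s, NULL, &hints, &res) != 0)
        return false;

    const struct sockaddr_in *sin = (const struct sockaddr_in *)res->ai_addr;
    *out = ntohl(sin->sin_addr.s_addr);
    freeaddrinfo(res);
    return true;
}
```

With glibc, "10.1" should come back as 10.0.0.1 here, just as it does from 
inet_aton(), while a real host name is rejected because of AI_NUMERICHOST.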

The program I used to test these APIs is attached.

Reference locations (in glibc-2.9):
    * resolv/inet_addr.c implements inet_aton(), inet_addr()
    * resolv/inet_pton.c implements inet_pton()
    * sysdeps/posix/getaddrinfo.c implements getaddrinfo()
    * nss/getXXbyYY.c implements gethostbyname*()
    * nss/getXXbyYY_r.c implements gethostbyname*_r()
    * nss/digits_dots.c implements __nss_hostname_digits_dots()

Raimo Niskanen wrote:
> I have done some research on my own...
>
> These are the ones that succeed (and other numbers
> within the ranges, of course):
>
> Linux, FreeBSD, Solaris:
> 		AF_INET
> "127.0.0.1"	->	127.0.0.1
> "192.168.1"	->	192.168.0.1
> "10.1"		->	10.0.0.1
> "17"		->	0.0.0.17
> "192.168.65535"	->	192.168.255.255
> "10.16777215"	->	10.255.255.255
> "4294967295"	->	255.255.255.255
> 		AF_INET6
> "127.0.0.1"	->	::ffff:127.0.0.1
> "192.168.1"	->	::ffff:192.168.0.1
> "10.1"		->	::ffff:10.0.0.1
> "17"		->	::ffff:0.0.0.17
> "192.168.65535"	->	::ffff:192.168.255.255
> "10.16777215"	->	::ffff:10.255.255.255
> "4294967295"	->	::ffff:255.255.255.255
> "::127.0.0.1"	->	::127.0.0.1
> "::"		->	::
>
> FreeBSD (addendum):
> 		AF_INET
> "127.0.0.1."	->	127.0.0.1
>
> OpenBSD:
> 		AF_INET
> "127.0.0.1"	->	127.0.0.1
> "127.0.0.1."	->	127.0.0.1
> 		AF_INET6
> "::127.0.0.1"	->	::127.0.0.1
> "::"		->	::
>
> For IPv6 addresses there seems to be consensus: if it
> parses as an IPv6 address according to the specifications
> I recall, that IPv6 address is returned, except that OpenBSD
> does not accept an IPv4 string when requesting an IPv6
> address while the others do.
>
> For IPv4 addresses Linux, FreeBSD and Solaris regard
> many numeric strings as IPv4 addresses while OpenBSD
> requires a 4-field dotted decimal. Both BSDs accept
> a trailing dot for 4-field dotted decimal, while
> Linux and Solaris regard a trailing dot as proof
> that the string is an absolute hostname.
>
> Conclusions:
>
> The least common denominator (and the most common case)
> would be to regard 4-field dotted decimal [0..255]
> with no trailing dot as an IPv4 string.
>
> The most widespread behaviour would be the Linux, Solaris
> and FreeBSD (except for trailing dot) behaviour. And since
> OpenBSD is not the origin of Erlang/OTP and has little
> importance in the community, it would probably be
> the most sensible behaviour.
>
> The current inet_parse:ipv4_address/1 needs to be
> augmented to handle the "192.168.65535",
> "10.16777215" and "4294967295" IPv4 strings.
>
> I'll toss the suggestion around internally and see if and when
> we can make such a change into the Linux/Solaris behaviour...
>
> / Raimo
>
>
>
> On Mon, Jan 18, 2010 at 04:07:24PM +0100, Raimo Niskanen wrote:
>   
>> On Mon, Jan 18, 2010 at 10:57:53AM +0800, Chaos Wang wrote:
>>     
>>> Hi all,
>>>
>>> inet:gethostbyname_tm/4 always tries every specified DNS resolution 
>>> method first, and only checks whether the given domain name is an 
>>> IPv4/v6 address after all of them have failed. So even when a string 
>>> containing a valid IP address is given as the domain name to resolve, 
>>> it still has to go through all the resolution methods before being 
>>> recognized as an IP address at the very end.
>>>
>>> This causes serious problems if the 'dns' resolution method is 
>>> specified in some corporate internal networks, in which all unknown 
>>> domain names (including IPv4/v6 address strings treated as domain 
>>> names) are resolved to the same portal server address. Only the 
>>> 'native' resolution method can be used in such an environment, because 
>>> the libc DNS resolving API first checks whether the domain name is an 
>>> IP address.
>>>
>>> For example, in my working network, the results with 
>>> {lookup,[native]} specified in the kernel inetrc are as follows:
>>>
>>>    % real domain name, resolvable at DNS server
>>>    > inet:getaddr("www.google.com", inet).
>>>    {ok,{64,233,189,99}}
>>>    % treated-as-domain IP address, not resolvable at DNS server
>>>    > inet:getaddr("10.0.0.2", inet).
>>>    {ok,{10,0,0,2}}
>>>
>>> But with {lookup,[dns]} specified in the kernel inetrc, the results become:
>>>
>>>    % real domain name, resolvable at DNS server
>>>    > inet:getaddr("www.google.com", inet).
>>>    {ok,{64,233,189,99}}
>>>    % treated-as-domain IP address, resolved to portal server address
>>>    % by DNS server
>>>    > inet:getaddr("10.0.0.2", inet).
>>>    {ok,{115,124,17,136}}   % Oops...
>>>
>>> IMHO the search order in inet:gethostbyname_tm/4 should be changed to: 
>>> first check whether the domain name is already an IP address, then 
>>> try all the specified domain resolution methods.
>>>
>>> Thanks!
>>>       
>> Hi!
>>
>> You make a good case for changing the resolving order. I am almost
>> on your side, there is just one little detail...:
>>
>> Historically, portal server fake IP addresses have not been an issue
>> for inet_res (the DNS resolver). Instead, it has had to balance between
>> the RFCs and what actually is done in product networks.
>>
>> It is not impossible for inet_res to be in an environment where
>> the default domain is foo.bar and a lookup for "17" is supposed
>> to return the IP address for the host 17.foo.bar. Now "17" is
>> not a DNS label according to RFC 1035 section 2.3.1 but that
>> is only a "Preferred name syntax".
>>
>> Today it is more unlikely. But the question still is: 
>> when can you safely assume the lookup string at hand is
>> an IP address and not a host name?
>>
>> The existing function inet_parse:ipv4_address is probably
>> too forgiving since it translates "17" -> {0,0,0,17},
>> "17.18" -> {17,0,0,18}, "17.18.19" -> {17,18,0,19}
>> and "17.18.19.20" -> {17,18,19,20}, all from ancient
>> praxis or even standards.
>>
>> IPv6 addresses are more clear cut since any IPv6 address must contain
>> at least two colons and that is very unlikely for a host name.
>>
>> Can you strengthen your case by finding out more about what it takes for
>> libc DNS to be convinced the lookup string is an IPv4 address? 
>>
>>     
>>> chaoslawful
>>>
>>>
>>> ________________________________________________________________
>>> erlang-bugs mailing list. See http://www.erlang.org/faq.html
>>> erlang-bugs (at) erlang.org
>>>       
>> -- 
>>
>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>
>>     
>
>   

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ip.c
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20100120/06cebc26/attachment.c>

