[erlang-bugs] possibly incorrect search order in inet:gethostbyname_tm/4
Chaos Wang
chaoslawful@REDACTED
Wed Jan 20 16:17:37 CET 2010
Raimo Niskanen wrote:
> On Wed, Jan 20, 2010 at 12:22:08AM +0800, Chaos Wang wrote:
>
>> Cool~
>>
>> Sorry for responding so late. I was digging into some glibc source code...
>>
>> The following are my findings (IPv4 only, on Linux). And I totally agree
>> with you that the safest form to treat as an IPv4 address is
>> the standard IPv4 dotted-decimal notation without a trailing dot.
>>
>
> But I think the best way would be to adopt the Solaris/Linux behaviour.
> Perhaps even also the hex/octal notation...
>
Either way is OK to me, as long as it can recognize addresses in
standard IPv4 dotted-decimal notation correctly ;-)
>
>> The libc APIs I can think of that parse an IPv4 address string into
>> in_addr form are:
>> * inet_addr() (deprecated)
>> * inet_aton()
>> * inet_pton()
>> * gethostbyname() and gethostbyname_r() (obsolete, but used by
>> inet_gethost program)
>> * gethostbyname2() and gethostbyname2_r() (GNU extension)
>> * getaddrinfo()
>>
>> In all these functions, a string with a trailing dot is not considered
>> an IPv4 address.
>>
>> inet_aton() (and deprecated inet_addr()) recognize IPv4 numbers-and-dots
>> notation: every dotted number in the address can be in decimal, octal or
>> hexadecimal. And the address can also be written in shorthand:
>>
>> a - means treat a as 32 bits
>> a.b - means treat b as 24 bits
>> a.b.c - means treat c as 16 bits
>>
>> inet_pton() is like inet_aton(), but without the hexadecimal, octal
>> (with the exception of a plain 0) and shorthand forms. So it only
>> recognizes standard IPv4 dotted-decimal notation.
>>
>> gethostbyname() (also gethostbyname2() and the *_r variants) use
>> __nss_hostname_digits_dots() to identify an IP address. This function calls
>> inet_aton() to parse the IPv4 address, except that it refuses to accept any
>> non-digit characters (apart from the dots). So the hexadecimal form of
>> IPv4 addresses can't be recognized by it.
>>
>> getaddrinfo() uses inet_aton() to recognize an IPv4 address, so the two
>> are equivalent in IPv4 address parsing.
>>
>
> I just found out you have almost created the Solaris man page for inet_pton:
> http://www.s-gms.ms.edus.si/cgi-bin/man-cgi -> search command inet_pton
>
Ah, truly a coincidence :-)
Manpages in Solaris are more thorough than in Linux, indeed.
>
>> The program I used to test these APIs is in the attachments.
>>
>> Reference locations (in glibc-2.9):
>> * resolv/inet_addr.c implements inet_aton(), inet_addr()
>> * resolv/inet_pton.c implements inet_pton()
>> * sysdeps/posix/getaddrinfo.c implements getaddrinfo()
>> * nss/getXXbyYY.c implements gethostbyname*()
>> * nss/getXXbyYY_r.c implements gethostbyname*_r()
>> * nss/digits_dots.c implements __nss_hostname_digits_dots()
>>
>> Raimo Niskanen wrote:
>>
>>> I have done some research on my own...
>>>
>>> These are the ones that succeed (and other numbers
>>> within the ranges, of course):
>>>
>>> Linux, FreeBSD, Solaris:
>>> AF_INET
>>> "127.0.0.1" -> 127.0.0.1
>>> "192.168.1" -> 192.168.0.1
>>> "10.1" -> 10.0.0.1
>>> "17" -> 0.0.0.17
>>> "192.168.65535" -> 192.168.255.255
>>> "10.16777215" -> 10.255.255.255
>>> "4294967295" -> 255.255.255.255
>>> AF_INET6
>>> "127.0.0.1" -> ::ffff:127.0.0.1
>>> "192.168.1" -> ::ffff:192.168.0.1
>>> "10.1" -> ::ffff:10.0.0.1
>>> "17" -> ::ffff:0.0.0.17
>>> "192.168.65535" -> ::ffff:192.168.255.255
>>> "10.16777215" -> ::ffff:10.255.255.255
>>> "4294967295" -> ::ffff:255.255.255.255
>>> "::127.0.0.1" -> ::127.0.0.1
>>> "::" -> ::
>>>
>>> FreeBSD (addendum):
>>> AF_INET
>>> "127.0.0.1." -> 127.0.0.1
>>>
>>> OpenBSD:
>>> AF_INET
>>> "127.0.0.1" -> 127.0.0.1
>>> "127.0.0.1." -> 127.0.0.1
>>> AF_INET6
>>> "::127.0.0.1" -> ::127.0.0.1
>>> "::" -> ::
>>>
>>> For IPv6 addresses there seems to be consensus: if it
>>> parses as an IPv6 address according to the specifications
>>> I recall, that IPv6 address is returned, except that OpenBSD
>>> does not accept an IPv4 string when requesting an IPv6
>>> address while the others do.
>>>
>>> For IPv4 addresses, Linux, FreeBSD and Solaris regard
>>> many numeric strings as IPv4 addresses while OpenBSD
>>> requires a 4-field dotted decimal. Both BSDs accept
>>> a trailing dot for 4-field dotted decimal, while
>>> Linux and Solaris regard a trailing dot as proof
>>> that the string is an absolute hostname.
>>>
>>> Conclusions:
>>>
>>> The least common denominator (and the most common case)
>>> would be to regard 4-field dotted decimal [0..255]
>>> with no trailing dot as an IPv4 string.
>>>
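The least-common-denominator rule above (four decimal fields in 0..255, no trailing dot) can be sketched as a small hand-written check. This is only an illustration under that stated rule, not the inet_parse implementation, and parse_strict_ipv4 is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: accept only four decimal fields in 0..255,
 * separated by single dots, with no trailing dot. On success store the
 * fields in out[0..3] and return 1; otherwise return 0 (the contents
 * of out are then unspecified). */
int parse_strict_ipv4(const char *s, unsigned char out[4])
{
    int field;

    assert(s != NULL && out != NULL);
    for (field = 0; field < 4; field++) {
        unsigned value = 0;
        int digits = 0;

        while (*s >= '0' && *s <= '9') {
            value = value * 10 + (unsigned)(*s - '0');
            if (++digits > 3 || value > 255)
                return 0;       /* field too long or out of range */
            s++;
        }
        if (digits == 0)
            return 0;           /* empty field, e.g. "1..2.3" */
        out[field] = (unsigned char)value;

        if (field < 3) {
            if (*s != '.')
                return 0;       /* fewer than four fields */
            s++;
        }
    }
    return *s == '\0';          /* rejects a trailing dot or other junk */
}
```

Under this rule "127.0.0.1" is an address, while "127.0.0.1.", "10.1", "17" and "192.168.65535" would all be left to the resolver as host names. Note that this sketch reads a field such as "010" as decimal 10, which is a choice of mine and not something settled in this thread.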
>>> The most widespread behaviour would be the Linux, Solaris
>>> and FreeBSD (except for trailing dot) behaviour. And since
>>> OpenBSD is not the origin of Erlang/OTP and has little
>>> importance in the community, it would probably be
>>> the most sensible behaviour.
>>>
>>> The current inet_parse:ipv4_address/1 needs to be
>>> augmented to handle the "192.168.65535",
>>> "10.16777215" and "4294967295" IPv4 strings.
>>>
>>> I'll toss the suggestion around internally and see if and when
>>> we can make such a change into the Linux/Solaris behaviour...
>>>
>>> / Raimo
>>>
>>>
>>>
>>> On Mon, Jan 18, 2010 at 04:07:24PM +0100, Raimo Niskanen wrote:
>>>
>>>
>>>> On Mon, Jan 18, 2010 at 10:57:53AM +0800, Chaos Wang wrote:
>>>>
>>>>
>>>>> Hi all,
>>>>>
>>>>> inet:gethostbyname_tm/4 always tries all the specified resolution methods
>>>>> first, and only checks whether the given domain name is an IPv4/v6 address
>>>>> after all of them have failed. So even when a string containing a valid IP
>>>>> address is given as the domain name to resolve, it still has to go through
>>>>> all the resolution methods before finding out, at the very end, that it is
>>>>> already an IP address.
>>>>>
>>>>> This causes serious problems if the 'dns' resolution method is
>>>>> specified in some corporate internal networks, where all unknown
>>>>> domain names (including IPv4/v6 address strings treated as domain names)
>>>>> are resolved to the same portal server address. Only the 'native'
>>>>> resolution method can be used in such an environment, because the libc
>>>>> DNS resolving API first checks whether the domain name is an IP address.
>>>>>
>>>>> For example, in my working network, with {lookup,[native]} specified
>>>>> in the kernel inetrc, the results are as follows:
>>>>>
>>>>> % real domain name, resolvable at DNS server
>>>>> > inet:getaddr("www.google.com", inet).
>>>>> {ok,{64,233,189,99}}
>>>>> % IP address treated as a domain name, not resolvable at DNS server
>>>>> > inet:getaddr("10.0.0.2", inet).
>>>>> {ok,{10,0,0,2}}
>>>>>
>>>>> But with {lookup,[dns]} specified in the kernel inetrc, the results become:
>>>>>
>>>>> % real domain name, resolvable at DNS server
>>>>> > inet:getaddr("www.google.com", inet).
>>>>> {ok,{64,233,189,99}}
>>>>> % IP address treated as a domain name, resolved to the portal server
>>>>> % address by the DNS server
>>>>> > inet:getaddr("10.0.0.2", inet).
>>>>> {ok,{115,124,17,136}} % Oops...
>>>>>
>>>>> IMHO the search order in inet:gethostbyname_tm/4 should be changed to:
>>>>> first check whether the domain name is already an IP address, then
>>>>> try all the specified resolution methods.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>> Hi!
>>>>
>>>> You make a good case for changing the resolving order. I am almost
>>>> on your side, there is just one little detail...:
>>>>
>>>> Historically, portal server fake IP addresses have not been an issue
>>>> for inet_res (the DNS resolver). Instead, it has had to balance between
>>>> the RFCs and what is actually done in product networks.
>>>>
>>>> It is not impossible for inet_res to be in an environment where
>>>> the default domain is foo.bar and a lookup for "17" is supposed
>>>> to return the IP address for the host 17.foo.bar. Now "17" is
>>>> not a DNS label according to RFC 1035 section 2.3.1 but that
>>>> is only a "Preferred name syntax".
>>>>
>>>> Today it is more unlikely. But the question still is:
>>>> when can you safely assume the lookup string at hand is
>>>> an IP address and not a host name?
>>>>
>>>> The existing function inet_parse:ipv4_address is probably
>>>> too forgiving since it translates "17" -> {0,0,0,17},
>>>> "17.18" -> {17,0,0,18}, "17.18.19" -> {17,18,0,19}
>>>> and "17.18.19.20" -> {17,18,19,20}, all from ancient
>>>> praxis or even standards.
>>>>
>>>> IPv6 addresses are more clear cut since any IPv6 address must contain
>>>> at least two colons and that is very unlikely for a host name.
>>>>
>>>> Can you strengthen your case by finding out more what it takes for
>>>> libc DNS to be convinced the lookup string is an IPv4 address?
>>>>
>>>>
>>>>
>>>>> chaoslawful
>>>>>
>>>>>
>>>>> ________________________________________________________________
>>>>> erlang-bugs mailing list. See http://www.erlang.org/faq.html
>>>>> erlang-bugs (at) erlang.org
>>>>>
>>>>>
>>>> --
>>>>
>>>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>>>
>>>>
>>>>
>>>
>>>
>
>
>> #include <sys/socket.h>
>> #include <netinet/in.h>
>> #include <arpa/inet.h>
>> #include <netdb.h>
>> #include <stdio.h>
>> #include <errno.h>
>> #include <string.h>
>>
>> void use_inet_addr(const char *name);
>> void use_inet_aton(const char *name);
>> void use_inet_pton(const char *name, int af);
>> void use_gethostbyname(const char *name);
>> void use_gethostbyname_r(const char *name);
>> void use_gethostbyname2(const char *name, int af);
>> void use_gethostbyname2_r(const char *name, int af);
>> void use_getaddrinfo(const char *name);
>>
>> int main(int argc, char *argv[])
>> {
>>     if(argc != 2) {
>>         printf("Usage: %s <IPv4 addr>\n", argv[0]);
>>         return 1;
>>     }
>>
>>     use_inet_addr(argv[1]);
>>     use_inet_aton(argv[1]);
>>     use_inet_pton(argv[1], AF_INET);
>>     use_gethostbyname(argv[1]);
>>     use_gethostbyname_r(argv[1]);
>> #if defined(_BSD_SOURCE) || defined(_SVID_SOURCE)
>>     use_gethostbyname2(argv[1], AF_INET);
>>     use_gethostbyname2_r(argv[1], AF_INET);
>> #endif
>>     use_getaddrinfo(argv[1]);
>>
>>     return 0;
>> }
>>
>> void use_inet_addr(const char *name)
>> {
>>     in_addr_t addr = inet_addr(name);
>>
>>     printf("inet_addr: ");
>>     if(addr == INADDR_NONE) {
>>         printf("failed (possibly 255.255.255.255)\n");
>>     } else {
>>         struct in_addr in;
>>         in.s_addr = addr;
>>         printf("%s\n", inet_ntoa(in));
>>     }
>> }
>>
>> void use_inet_aton(const char *name)
>> {
>>     struct in_addr in;
>>
>>     printf("inet_aton: ");
>>     if(inet_aton(name, &in)) {
>>         printf("%s\n", inet_ntoa(in));
>>     } else {
>>         printf("failed\n");
>>     }
>> }
>>
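>> void use_inet_pton(const char *name, int af)
>> {
>>     struct in_addr in;
>>
>>     printf("inet_pton: ");
>>     /* inet_pton() returns 1 on success, 0 for an unparsable string and
>>      * -1 for an unsupported address family, so test for 1 explicitly */
>>     if(inet_pton(af, name, &in) == 1) {
>>         printf("%s\n", inet_ntoa(in));
>>     } else {
>>         printf("failed\n");
>>     }
>> }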
>>
>> void use_gethostbyname(const char *name)
>> {
>>     struct hostent *h = gethostbyname(name);
>>
>>     printf("gethostbyname: ");
>>     if(!h) {
>>         printf("failed (%s)\n", hstrerror(h_errno));
>>     } else {
>>         if(h->h_addrtype != AF_INET) {
>>             printf("failed (invalid address type)\n");
>>         } else {
>>             char **pp = h->h_addr_list;
>>             while(*pp != NULL) {
>>                 struct in_addr *p = (struct in_addr*)(*pp);
>>                 printf("%s\n", inet_ntoa(*p));
>>                 ++pp;
>>             }
>>         }
>>     }
>> }
>>
>> void use_gethostbyname_r(const char *name)
>> {
>>     int rc;
>>     char buf[8192];
>>     struct hostent h;
>>     struct hostent *rp;
>>     int myerrno;
>>
>>     printf("gethostbyname_r: ");
>>     rc = gethostbyname_r(name, &h, buf, sizeof(buf), &rp, &myerrno);
>>     if(rc == ERANGE) {
>>         /* ERANGE means the caller-supplied buffer is too small */
>>         printf("failed (buffer too small)\n");
>>     } else if(!rc) {
>>         if(!rp) {
>>             printf("no address found\n");
>>         } else {
>>             char **pp = h.h_addr_list;
>>             while(*pp != NULL) {
>>                 struct in_addr *p = (struct in_addr*)(*pp);
>>                 printf("%s\n", inet_ntoa(*p));
>>                 ++pp;
>>             }
>>         }
>>     } else {
>>         printf("failed (%s)\n", hstrerror(myerrno));
>>     }
>> }
>>
>> #if defined(_BSD_SOURCE) || defined(_SVID_SOURCE)
>>
>> void use_gethostbyname2(const char *name, int af)
>> {
>>     struct hostent *h = gethostbyname2(name, af);
>>
>>     printf("gethostbyname2: ");
>>     if(!h) {
>>         printf("failed (%s)\n", hstrerror(h_errno));
>>     } else {
>>         if(h->h_addrtype != AF_INET) {
>>             printf("failed (invalid address type)\n");
>>         } else {
>>             char **pp = h->h_addr_list;
>>             while(*pp != NULL) {
>>                 struct in_addr *p = (struct in_addr*)(*pp);
>>                 printf("%s\n", inet_ntoa(*p));
>>                 ++pp;
>>             }
>>         }
>>     }
>> }
>>
>> void use_gethostbyname2_r(const char *name, int af)
>> {
>>     int rc;
>>     char buf[8192];
>>     struct hostent h;
>>     struct hostent *rp;
>>     int myerrno;
>>
>>     printf("gethostbyname2_r: ");
>>     rc = gethostbyname2_r(name, af, &h, buf, sizeof(buf), &rp, &myerrno);
>>     if(rc == ERANGE) {
>>         /* ERANGE means the caller-supplied buffer is too small */
>>         printf("failed (buffer too small)\n");
>>     } else if(!rc) {
>>         if(!rp) {
>>             printf("no address found\n");
>>         } else {
>>             char **pp = h.h_addr_list;
>>             while(*pp != NULL) {
>>                 struct in_addr *p = (struct in_addr*)(*pp);
>>                 printf("%s\n", inet_ntoa(*p));
>>                 ++pp;
>>             }
>>         }
>>     } else {
>>         printf("failed (%s)\n", hstrerror(myerrno));
>>     }
>> }
>>
>> #endif
>>
>> void use_getaddrinfo(const char *name)
>> {
>>     int rc;
>>     struct addrinfo hints;
>>     struct addrinfo *resp;
>>
>>     printf("getaddrinfo: ");
>>
>>     memset(&hints, 0, sizeof(hints));
>>     hints.ai_family = AF_INET;
>>     hints.ai_flags = AI_ADDRCONFIG | AI_PASSIVE;
>>     hints.ai_socktype = SOCK_STREAM;
>>
>>     rc = getaddrinfo(name, NULL, &hints, &resp);
>>     if(rc) {
>>         printf("failed (%s)\n", gai_strerror(rc));
>>     } else {
>>         struct addrinfo *rp;
>>
>>         for(rp = resp; rp != NULL; rp = rp->ai_next) {
>>             struct sockaddr_in *addr = (struct sockaddr_in*)(rp->ai_addr);
>>             printf("%s\n", inet_ntoa(addr->sin_addr));
>>         }
>>
>>         freeaddrinfo(resp);
>>     }
>> }
>>
>>
>>
>
>
>>
>
>