[erlang-bugs] possibly incorrect search order in inet:gethostbyname_tm/4

Raimo Niskanen raimo+erlang-bugs@REDACTED
Wed Jan 20 12:23:22 CET 2010


On Wed, Jan 20, 2010 at 12:22:08AM +0800, Chaos Wang wrote:
> Cool~
> 
> Sorry for responding so late. I was digging into some glibc source code...
> 
> The following are my findings (IPv4 only, on Linux). And I totally agree 
> with you that the safest form to treat as an IPv4 address is 
> the standard IPv4 dotted-decimal notation without a trailing dot.

But I think the best way would be to adopt the Solaris/Linux behaviour,
perhaps even including the hex/octal notation...

> 
> The libc APIs I know of that parse an IPv4 address string into in_addr 
> form are:
>    * inet_addr() (deprecated)
>    * inet_aton()
>    * inet_pton()
>    * gethostbyname() and gethostbyname_r() (obsolete, but used by 
> inet_gethost program)
>    * gethostbyname2() and gethostbyname2_r() (GNU extension)
>    * getaddrinfo()
> 
> In all these functions, a string with a trailing dot is not considered 
> an IPv4 address.
> 
> inet_aton() (and the deprecated inet_addr()) recognizes the IPv4 
> numbers-and-dots notation: each number in the address can be decimal, 
> octal or hexadecimal. The address can also be written in shorthand:
> 
>    a - a is taken as the whole 32-bit address
>    a.b - b is taken as the low 24 bits
>    a.b.c - c is taken as the low 16 bits
> 
> inet_pton() is like inet_aton(), but without the hexadecimal, octal 
> (with the exception of 0) and shorthand forms. So it only recognizes 
> standard IPv4 dotted-decimal notation.
> 
> gethostbyname() (also gethostbyname2() and the *_r variants) uses 
> __nss_hostname_digits_dots() to identify an IP address. This function calls 
> inet_aton() to parse an IPv4 address, except that it refuses to accept any 
> non-digit characters. So the hexadecimal form of IPv4 addresses can't be 
> recognized by it.
> 
> getaddrinfo() uses inet_aton() to recognize an IPv4 address, so the two 
> are equivalent in IPv4 address parsing.
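
The numbers-and-dots rules described above (decimal, octal or hex parts,
plus the a, a.b and a.b.c shorthand) can be sketched roughly as follows.
This is an illustration only, not the glibc code; the function name
parse_numbers_and_dots is hypothetical, and the result is in host byte
order rather than the network byte order that inet_aton() produces:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse "a", "a.b", "a.b.c" or "a.b.c.d", where
 * each part may be decimal, octal (leading 0) or hex (leading 0x).
 * Returns 1 and stores the address in *out on success, else 0. */
int parse_numbers_and_dots(const char *s, uint32_t *out)
{
	unsigned long part[4];
	int n = 0;

	assert(s != NULL && out != NULL);

	for (;;) {
		char *end;

		/* Each part must start with a digit: no sign, no blanks,
		 * and no empty part such as the one after a trailing dot. */
		if (!isdigit((unsigned char)*s))
			return 0;
		errno = 0;
		part[n] = strtoul(s, &end, 0);	/* base 0: 10, 010, 0x10 */
		if (errno == ERANGE || part[n] > 0xffffffffUL)
			return 0;
		n++;
		if (*end == '\0')
			break;
		if (*end != '.' || n == 4)
			return 0;
		s = end + 1;
	}

	/* Every part but the last is one byte; the last part fills the
	 * remaining 8, 16, 24 or 32 bits. */
	uint32_t addr = 0;
	for (int i = 0; i < n - 1; i++) {
		if (part[i] > 0xff)
			return 0;
		addr |= (uint32_t)part[i] << (24 - 8 * i);
	}
	int last_bits = 32 - 8 * (n - 1);
	if (last_bits < 32 && (part[n - 1] >> last_bits) != 0)
		return 0;
	*out = addr | (uint32_t)part[n - 1];
	return 1;
}
```

Under these rules "192.168.65535" gives 192.168.255.255 and "10.1" gives
10.0.0.1, while "127.0.0.1." is rejected because of the trailing dot.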

I just found out that you have almost recreated the Solaris man page for inet_pton:
http://www.s-gms.ms.edus.si/cgi-bin/man-cgi -> search command inet_pton

> 
> The program I used to test these APIs is in the attachments.
> 
> Reference locations (in glibc-2.9):
>    * resolv/inet_addr.c implements inet_aton(), inet_addr()
>    * resolv/inet_pton.c implements inet_pton()
>    * sysdeps/posix/getaddrinfo.c implements getaddrinfo()
>    * nss/getXXbyYY.c implements gethostbyname*()
>    * nss/getXXbyYY_r.c implements gethostbyname*_r()
>    * nss/digits_dots.c implements __nss_hostname_digits_dots()
> 
> Raimo Niskanen wrote:
> >I have done some research on my own...
> >
> >These are the ones that succeed (and other numbers
> >within the ranges, of course):
> >
> >Linux, FreeBSD, Solaris:
> >		AF_INET
> >"127.0.0.1"	->	127.0.0.1
> >"192.168.1"	->	192.168.0.1
> >"10.1"		->	10.0.0.1
> >"17"		->	0.0.0.17
> >"192.168.65535"	->	192.168.255.255
> >"10.16777215"	->	10.255.255.255
> >"4294967295"	->	255.255.255.255
> >		AF_INET6
> >"127.0.0.1"	->	::ffff:127.0.0.1
> >"192.168.1"	->	::ffff:192.168.0.1
> >"10.1"		->	::ffff:10.0.0.1
> >"17"		->	::ffff:0.0.0.17
> >"192.168.65535"	->	::ffff:192.168.255.255
> >"10.16777215"	->	::ffff:10.255.255.255
> >"4294967295"	->	::ffff:255.255.255.255
> >"::127.0.0.1"	->	::127.0.0.1
> >"::"		->	::
> >
> >FreeBSD (addendum):
> >		AF_INET
> >"127.0.0.1."	->	127.0.0.1
> >
> >OpenBSD:
> >		AF_INET
> >"127.0.0.1"	->	127.0.0.1
> >"127.0.0.1."	->	127.0.0.1
> >		AF_INET6
> >"::127.0.0.1"	->	::127.0.0.1
> >"::"		->	::
> >
> >For IPv6 addresses there seems to be consensus: if it
> >parses as an IPv6 address according to the specifications
> >I recall, that IPv6 address is returned, except that OpenBSD
> >does not accept an IPv4 string when requesting an IPv6
> >address while the others do.
> >
> >For IPv4 addresses Linux, FreeBSD and Solaris regard
> >many numeric strings as IPv4 addresses while OpenBSD
> >requires a 4-field dotted decimal. Both BSDs accept
> >a trailing dot for 4-field dotted decimal, while
> >Linux and Solaris regard a trailing dot as proof
> >that the string is an absolute hostname.
> >
> >Conclusions:
> >
> >The least common denominator (and the most common case)
> >would be to regard 4-field dotted decimal [0..255]
> >with no trailing dot as an IPv4 string.
> >
> >The most widespread behaviour would be the Linux, Solaris
> >and FreeBSD (except for trailing dot) behaviour. And since
> >OpenBSD is not the origin of Erlang/OTP and has little
> >importance in the community, it would probably be
> >the most sensible behaviour.
> >
> >The current inet_parse:ipv4_address/1 needs to be
> >augmented to handle the "192.168.65535",
> >"10.16777215" and "4294967295" IPv4 strings.
> >
> >I'll toss the suggestion around internally and see if and when
> >we can make such a change into the Linux/Solaris behaviour...
> >
> >/ Raimo
> >
> >
> >
> >On Mon, Jan 18, 2010 at 04:07:24PM +0100, Raimo Niskanen wrote:
> >  
> >>On Mon, Jan 18, 2010 at 10:57:53AM +0800, Chaos Wang wrote:
> >>    
> >>>Hi all,
> >>>
> >>>inet:gethostbyname_tm/4 always tries all the specified DNS resolution 
> >>>methods first, and only checks whether the given domain name is an 
> >>>IPv4/v6 address when all previous tries have failed. So even if a string 
> >>>containing a valid IP address is given as the domain name to be resolved, 
> >>>it still has to traverse all the resolution methods before finding out 
> >>>at last that it is already an IP address.
> >>>
> >>>This causes serious problems if the 'dns' resolution method is 
> >>>specified in some corporate internal networks, in which all unknown 
> >>>domain names (including IPv4/v6 address strings treated as domain names) 
> >>>are resolved to the same portal server address. Only the 'native' 
> >>>resolution method can be used in such an environment, because the libc 
> >>>DNS resolving API first checks whether the domain name is an IP 
> >>>address.
> >>>
> >>>For example, in my working network, the resolution results with 
> >>>{lookup,[native]} specified in the kernel inetrc are as follows:
> >>>
> >>>   > inet:getaddr("www.google.com", inet).    % real domain name, resolvable at DNS server
> >>>   {ok,{64,233,189,99}}
> >>>   > inet:getaddr("10.0.0.2", inet).          % treated-as-domain IP address, not resolvable at DNS server
> >>>   {ok,{10,0,0,2}}
> >>>
> >>>But with {lookup,[dns]} specified in the kernel inetrc, the results become:
> >>>
> >>>   > inet:getaddr("www.google.com", inet).    % real domain name, resolvable at DNS server
> >>>   {ok,{64,233,189,99}}
> >>>   > inet:getaddr("10.0.0.2", inet).          % treated-as-domain IP address, resolved to portal server address by DNS server
> >>>   {ok,{115,124,17,136}}   % Oops...
> >>>
> >>>IMHO the search order in inet:gethostbyname_tm/4 should be changed to: 
> >>>first check whether the domain name is already an IP address, then 
> >>>try all the specified domain resolution methods.
> >>>
> >>>Thanks!
> >>>      
> >>Hi!
> >>
> >>You make a good case for changing the resolving order. I am almost
> >>on your side, there is just one little detail...:
> >>
> >>Historically, portal server fake IP addresses have not been an issue
> >>for inet_res (the DNS resolver). Instead, it has had to balance between
> >>the RFCs and what actually is done in product networks.
> >>
> >>It is not impossible for inet_res to be in an environment where
> >>the default domain is foo.bar and a lookup for "17" is supposed
> >>to return the IP address for the host 17.foo.bar. Now "17" is
> >>not a DNS label according to RFC 1035 section 2.3.1 but that
> >>is only a "Preferred name syntax".
> >>
> >>Today it is more unlikely. But the question still is: 
> >>when can you safely assume that the lookup string at hand is
> >>an IP address and not a host name?
> >>
> >>The existing function inet_parse:ipv4_address is probably
> >>too forgiving since it translates "17" -> {0,0,0,17},
> >>"17.18" -> {17,0,0,18}, "17.18.19" -> {17,18,0,19}
> >>and "17.18.19.20" -> {17,18,19,20}, all from ancient
> >>praxis or even standards.
> >>
> >>IPv6 addresses are more clear cut since any IPv6 address must contain
> >>at least two colons and that is very unlikely for a host name.
> >>
> >>Can you strengthen your case by finding out more what it takes for
> >>libc DNS to be convinced the lookup string is an IPv4 address? 
> >>
> >>    
> >>>chaoslawful
> >>>
> >>>
> >>>________________________________________________________________
> >>>erlang-bugs mailing list. See http://www.erlang.org/faq.html
> >>>erlang-bugs (at) erlang.org
> >>>      
> >>-- 
> >>
> >>/ Raimo Niskanen, Erlang/OTP, Ericsson AB
> >>
> >>    
> >
> >  
> 

> #include <sys/socket.h>
> #include <netinet/in.h>
> #include <arpa/inet.h>
> #include <netdb.h>
> #include <stdio.h>
> #include <errno.h>
> #include <string.h>
> 
> void use_inet_addr(const char *name);
> void use_inet_aton(const char *name);
> void use_inet_pton(const char *name, int af);
> void use_gethostbyname(const char *name);
> void use_gethostbyname_r(const char *name);
> void use_gethostbyname2(const char *name, int af);
> void use_gethostbyname2_r(const char *name, int af);
> void use_getaddrinfo(const char *name);
> 
> int main(int argc, char *argv[])
> {
> 	if(argc != 2) {
> 		printf("Usage: %s <IPv4 addr>\n", argv[0]);
> 		return 1;
> 	}
> 
> 	use_inet_addr(argv[1]);
> 	use_inet_aton(argv[1]);
> 	use_inet_pton(argv[1], AF_INET);
> 	use_gethostbyname(argv[1]);
> 	use_gethostbyname_r(argv[1]);
> #if defined(_BSD_SOURCE) || defined(_SVID_SOURCE)
> 	use_gethostbyname2(argv[1], AF_INET);
> 	use_gethostbyname2_r(argv[1], AF_INET);
> #endif
> 	use_getaddrinfo(argv[1]);
> 
> 	return 0;
> }
> 
> void use_inet_addr(const char *name)
> {
> 	in_addr_t addr = inet_addr(name);
> 
> 	printf("inet_addr: ");
> 	if(addr == INADDR_NONE) {
> 		printf("failed (possibly 255.255.255.255)\n");
> 	} else {
> 		struct in_addr in;
> 		in.s_addr = addr;
> 		printf("%s\n", inet_ntoa(in));
> 	}
> }
> 
> void use_inet_aton(const char *name)
> {
> 	struct in_addr in;
> 
> 	printf("inet_aton: ");
> 	if(inet_aton(name, &in)) {
> 		printf("%s\n", inet_ntoa(in));
> 	} else {
> 		printf("failed\n");
> 	}
> }
> 
> void use_inet_pton(const char *name, int af)
> {
> 	struct in_addr in;
> 	
> 	printf("inet_pton: ");
> 	if(inet_pton(af, name, &in) == 1) {
> 		printf("%s\n", inet_ntoa(in));
> 	} else {
> 		printf("failed\n");
> 	}
> }
> 
> void use_gethostbyname(const char *name)
> {
> 	struct hostent *h = gethostbyname(name);
> 
> 	printf("gethostbyname: ");
> 	if(!h) {
> 		printf("failed (%s)\n", hstrerror(h_errno));
> 	} else {
> 		if(h->h_addrtype != AF_INET) {
> 			printf("failed (invalid address type)\n");
> 		} else {
> 			char **pp = h->h_addr_list;
> 			while(*pp != NULL) {
> 				struct in_addr *p = (struct in_addr*)(*pp);
> 				printf("%s\n", inet_ntoa(*p));
> 				++pp;
> 			}
> 		}
> 	}
> }
> 
> void use_gethostbyname_r(const char *name)
> {
> 	int rc;
> 	char buf[8192];
> 	struct hostent h;
> 	struct hostent *rp;
> 	int myerrno;
> 
> 	printf("gethostbyname_r: ");
> 	rc = gethostbyname_r(name, &h, buf, sizeof(buf), &rp, &myerrno);
> 	if(rc == ERANGE) {
> 		printf("failed (buffer too small)\n");
> 	} else if(!rc) {
> 		if(!rp) {
> 			printf("no address found\n");
> 		} else {
> 			char **pp = h.h_addr_list;
> 			while(*pp != NULL) {
> 				struct in_addr *p = (struct in_addr*)(*pp);
> 				printf("%s\n", inet_ntoa(*p));
> 				++pp;
> 			}
> 		}
> 	} else {
> 		printf("failed (%s)\n", hstrerror(myerrno));
> 	}
> }
> 
> #if defined(_BSD_SOURCE) || defined(_SVID_SOURCE)
> 
> void use_gethostbyname2(const char *name, int af)
> {
> 	struct hostent *h = gethostbyname2(name, af);
> 
> 	printf("gethostbyname2: ");
> 	if(!h) {
> 		printf("failed (%s)\n", hstrerror(h_errno));
> 	} else {
> 		if(h->h_addrtype != AF_INET) {
> 			printf("failed (invalid address type)\n");
> 		} else {
> 			char **pp = h->h_addr_list;
> 			while(*pp != NULL) {
> 				struct in_addr *p = (struct in_addr*)(*pp);
> 				printf("%s\n", inet_ntoa(*p));
> 				++pp;
> 			}
> 		}
> 	}
> }
> 
> void use_gethostbyname2_r(const char *name, int af)
> {
> 	int rc;
> 	char buf[8192];
> 	struct hostent h;
> 	struct hostent *rp;
> 	int myerrno;
> 
> 	printf("gethostbyname2_r: ");
> 	rc = gethostbyname2_r(name, af, &h, buf, sizeof(buf), &rp, &myerrno);
> 	if(rc == ERANGE) {
> 		printf("failed (buffer too small)\n");
> 	} else if(!rc) {
> 		if(!rp) {
> 			printf("no address found\n");
> 		} else {
> 			char **pp = h.h_addr_list;
> 			while(*pp != NULL) {
> 				struct in_addr *p = (struct in_addr*)(*pp);
> 				printf("%s\n", inet_ntoa(*p));
> 				++pp;
> 			}
> 		}
> 	} else {
> 		printf("failed (%s)\n", hstrerror(myerrno));
> 	}
> }
> 
> #endif
> 
> void use_getaddrinfo(const char *name)
> {
> 	int rc;
> 	struct addrinfo hints;
> 	struct addrinfo *resp;
> 
> 	printf("getaddrinfo: ");
> 
> 	memset(&hints, 0, sizeof(hints));
> 	hints.ai_family = AF_INET;
> 	hints.ai_flags = AI_ADDRCONFIG | AI_PASSIVE;
> 	hints.ai_socktype = SOCK_STREAM;
> 	
> 	rc = getaddrinfo(name, NULL, &hints, &resp);
> 	if(rc) {
> 		printf("failed (%s)\n", gai_strerror(rc));
> 	} else {
> 		struct addrinfo *rp;
> 		
> 		for(rp = resp; rp != NULL; rp = rp->ai_next) {
> 			struct sockaddr_in *addr = (struct sockaddr_in*)(rp->ai_addr);
> 			printf("%s\n", inet_ntoa(addr->sin_addr));
> 		}
> 
> 		freeaddrinfo(resp);
> 	}
> }
> 
> 
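
To ask what the local libc itself regards as a numeric IPv4 string,
without any name service being consulted, getaddrinfo() can be given the
AI_NUMERICHOST flag. A minimal sketch along those lines follows; the
helper name is_numeric_ipv4_host is hypothetical and not part of the
attached program, and which shorthand forms are accepted depends on the
platform:

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: return 1 if the local getaddrinfo() accepts
 * name as a numeric IPv4 address string, 0 otherwise. */
int is_numeric_ipv4_host(const char *name)
{
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	int rc;

	assert(name != NULL);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET;
	hints.ai_socktype = SOCK_STREAM;
	/* AI_NUMERICHOST: name must be a numeric address string, so no
	 * host name resolution is attempted. */
	hints.ai_flags = AI_NUMERICHOST;

	rc = getaddrinfo(name, NULL, &hints, &res);
	if (rc != 0)
		return 0;
	freeaddrinfo(res);
	return 1;
}
```

A resolver that wants the "IP address first" order could call such a
check before trying any of its configured lookup methods.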


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

