[erlang-questions] DNS is slow when run from many processes

Witold Baryluk <>
Tue Feb 8 17:56:10 CET 2011


On 02-08 01:50, ori brost wrote:
> I've written a small program to demonstrate this:
> 
> ---BEGIN PROGRAM---
> -module(ghbm_bug_test).
> -compile(export_all).
> 
> run(PortCount,Url) ->
>        run(2000,PortCount,Url).
> 
> run(Port,PortCount,Url) ->
>        case Port < PortCount of
>                true ->
>                        spawn(fun() -> loop(Port,Url) end),
>                        run(Port + 1,PortCount,Url);
>                false ->
>                        ok
>        end.
> 
> loop(Port,Url) ->
>        case Port of
>                2003 -> io:format("Doing connect\n");
>                _    -> ok
>        end,
>        Sock = gen_tcp:connect(Url, 8080, [], 5000),
>        case Port of
>                2003 -> io:format("Did connect\n");
>                _    -> ok
>        end,
>        case (catch gen_tcp:close(Sock)) of _ -> ok end,
>        loop(Port,Url).
> ---END PROGRAM---
> 
> 
> This program takes a lot of time to connect when connecting via
> run(50000,"127.0.0.1") and yet connects quickly when I do run(50000,
> {127,0,0,1}). I am running it with a max number of processes of
> 1000000 (set via +P).
> 
> After checking the states of processes I see that many of them spend
> time in:
> 3> i(0,200,0).
> [{current_function,{inet_gethost_native,getit,2}},
> 
> 
> What are my possibilities for better DNS? I know that I can use erlang
> dns instead of native dns, this solves the problem for 127.0.0.1, but
> when I try a real address (i.e. run(50000,"some.server.of.mine.com"))
> connections are very slow with both native and erlang DNS.
> 
> Any advice on a solution?

I will give some tips. Please follow them in order: the simplest tricks
come first and should solve most of the problems.

Make sure a '127.0.0.1 localhost' entry is in /etc/hosts, and that the 'hosts:' entry
in /etc/nsswitch.conf isn't too complicated (preferably something like 'hosts: dns' should be all there is).

Also make sure there is no "domain", "search" or "options" line in /etc/resolv.conf
("options inet6" is acceptable if you need it),
and that /etc/gai.conf is unchanged (in case you are using IPv6).

If needed, use NSCD (or better, unscd, which has an improved
cache, multithreading and fault tolerance) in front of libc. Tune its cache size if needed.
(libc by itself does not cache answers, but 127.0.0.1 in /etc/hosts,
which is cached AFAIK until the file next changes, should be fast.)

If you are going to perform lots of DNS lookups of remote/Internet
hosts, I suggest installing a recursive DNS server with a cache
(there are lots of good servers) on the local machine (or at least on the local network),
and setting 'nameserver 127.0.0.1' or 'nameserver ::1' in /etc/resolv.conf.

If this is still too slow, you can skip the libc/nscd layer
by making Erlang connect directly to the nameserver (preferably on the same machine)
instead of using the native resolver.
See "ERTS User's Guide, Inet configuration". http://www.erlang.org/doc/apps/erts/inet_cfg.html
The current configuration can be retrieved with inet:get_rc().
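
As a rough illustration of reading that configuration, here is a minimal sketch that
pulls the resolver-related entries out of the list returned by inet:get_rc(). The
module name inetrc_check and the function summary are made up for this example, and
I am assuming that settings left at their defaults may simply be missing from the list.

```erlang
-module(inetrc_check).
-export([summary/0, summary/1]).

%% Summarise the resolver settings of the running node.
summary() ->
    summary(inet:get_rc()).

%% Same, but for an explicit list in the inet:get_rc() format,
%% i.e. a list of {Par, Val} and {Par, Val1, Val2} tuples.
summary(Rc) ->
    %% Lookup methods, e.g. [dns] or [native]; empty if not set explicitly.
    Lookup = lists:append([M || {lookup, M} <- Rc]),
    %% Nameservers may appear with or without an explicit port.
    NS = [IP || {nameserver, IP} <- Rc] ++
         [{IP, Port} || {nameserver, IP, Port} <- Rc],
    %% Cache size of the Erlang resolver, if it was configured.
    Cache = [N || {cache_size, N} <- Rc],
    [{lookup, Lookup}, {nameservers, NS}, {cache_size, Cache}].
```

Calling inetrc_check:summary() in the shell before and after changing the
configuration should show whether the node is really using the settings you expect.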

You will basically need to put, for example:

   clear_hosts.
   clear_ns.
   clear_search.
   {resolv_conf, "/dev/null"}.    % or any empty file, or just separate resolv.conf for erlang
   {hosts_file, "/dev/null"}.     % or any empty file, or some file with few needed entries
   {nameserver, {127,0,0,1}}.
   %% or {nameserver, {127,0,0,1}, 53}.  You can add more nameserver entries for fault tolerance.
   {lookup, [dns]}.
   {cache_size, 10000}.

into some file (./somewhere/erl_inetrc), and then run something like

   export ERL_INETRC=./somewhere/erl_inetrc

before starting the Erlang virtual machine.

(Actually, even just {lookup, [file, dns]}. should suffice, as Erlang will
autodetect the nameserver from /etc/resolv.conf and the entries from /etc/hosts,
and even monitor these files for changes at run time.)
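
To check that Erlang's own resolver gets answers from a given nameserver, and how
long a lookup takes, something like the following sketch may help. It queries the
nameserver directly through inet_res:lookup/4, so it does not depend on what is in
erl_inetrc. The module name dns_probe, the default address 127.0.0.1:53 and the
2 second timeout are assumptions made for the example.

```erlang
-module(dns_probe).
-export([probe/1, probe/2]).

%% Probe a nameserver assumed to listen on 127.0.0.1, port 53.
probe(Name) ->
    probe(Name, {127,0,0,1}).

%% Ask the nameserver at NsIp:53 for the A records of Name, bypassing
%% libc/nscd. Returns {Microseconds, Addresses}; Addresses is [] when
%% the lookup fails or times out.
probe(Name, NsIp) ->
    Opts = [{nameservers, [{NsIp, 53}]},
            {timeout, 2000},   % milliseconds
            {retry, 1}],
    timer:tc(fun() -> inet_res:lookup(Name, in, a, Opts) end).
```

For example, dns_probe:probe("some.server.of.mine.com") run twice in a row gives a
rough idea of how much the local caching nameserver helps on the second query.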

The Erlang DNS resolver performs caching, which helps a lot, but the cache is small by default.
You can increase its size by adding {cache_size, 10000}. to erl_inetrc.
(AFAIK, if Erlang uses the native glibc resolver, it does not use its own cache
and leaves caching to nscd/unscd or other mechanisms in glibc or the servers in use.
The cache is only used when Erlang itself connects to the DNS server,
and not when it goes via glibc/nscd.)

Unfortunately this cache does not cache negative answers, and
probably also does not properly respect the TTL values of RRs. This is the only real
problem with it; otherwise it works very nicely and is well designed.


This should help, and is very simple to configure, in about 15 minutes.
So give it a try.

If this is still not sufficient, then you have a bug somewhere,
or you have a really strange kind of workload. See below for further tips.


If this is still too slow, add some kind of application-side cache in Erlang
on top of it.
An ETS table with some garbage collection, or some kind of LRU cache,
will probably be sufficient. There are many possible and/or existing
solutions for caching in Erlang.
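
A minimal sketch of such an application-side cache, assuming a public named ETS
table and a fixed lifetime per entry, could look like this. The module name
dns_cache, the table name and the 60 second lifetime are invented for the example;
the lifetime is an arbitrary constant, not the TTL of the DNS record. Expired
entries are only overwritten on the next lookup of the same host, so a real cache
would still need some garbage collection or an LRU policy.

```erlang
-module(dns_cache).
-export([init/0, resolve/1, resolve/3]).

-define(TAB, dns_cache_tab).   % hypothetical table name
-define(TTL_SECS, 60).         % arbitrary lifetime, not the record's DNS TTL

%% Create the cache table. The table belongs to the calling process and
%% disappears when that process exits, so call this from a long-lived one.
init() ->
    ets:new(?TAB, [named_table, public, set, {read_concurrency, true}]),
    ok.

%% Resolve Host to an IPv4 address, using the cache when possible.
resolve(Host) ->
    resolve(Host, fun inet:getaddr/2, ?TTL_SECS).

%% LookupFun(Host, inet) must return {ok, Ip} or {error, Reason}.
resolve(Host, LookupFun, Ttl) ->
    Now = erlang:monotonic_time(second),
    case ets:lookup(?TAB, Host) of
        [{_, Ip, Expires}] when Expires > Now ->
            {ok, Ip};                          % fresh cache hit
        _ ->
            case LookupFun(Host, inet) of
                {ok, Ip} ->
                    ets:insert(?TAB, {Host, Ip, Now + Ttl}),
                    {ok, Ip};
                Error ->
                    Error                      % failures are not cached
            end
    end.
```

The address tuple returned by dns_cache:resolve/1 can then be passed straight to
gen_tcp:connect/4, which, as noted in the original question, connects quickly when
given {127,0,0,1} rather than a string.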


And lastly.

If you want even more speed, you can run a whole DNS server INSIDE Erlang
(written in Erlang, of course), which will perform the whole recursive resolution
by itself, from the root servers down to the destination, and cache everything.
(You can also prefill the cache with the known servers for all top-level domains,
as they change rarely and there are only about 5000 of them; this will
help a lot. You can use the same trick for a locally installed DNS server/cache too.)
Actually, it will not be a server, just a fully recursive resolver,
because the glibc and Erlang resolvers aren't fully recursive.

Unfortunately it will be somewhat complex, and I am not aware
of any DNS server in Erlang (beyond my own project, which is in alpha stage
and not useful for anything).
I also do not think it will provide a significant speedup
for any practical workload over the tips already mentioned
(a caching server on the local machine and DNS lookups by Erlang itself).
It might save some memory (which is good for caches)
and slightly improve latency (by eliminating network traffic over loopback
and context switches). But we are already talking about microseconds
here, and that does not matter when the actual network traffic is in the millisecond
range (unless, of course, you are performing a send-only UDP DDoS).
It also does not help much, as the Erlang DNS resolver already performs caching.


Regards,
Witek

-- 
Witold Baryluk
JID: witold.baryluk // jabster.pl

