[erlang-questions] DNS is slow when run from many processes

Witold Baryluk <>
Tue Feb 8 18:21:05 CET 2011


On 02-08 17:56, Witold Baryluk wrote:
> On 02-08 01:50, ori brost wrote:
> > I've written a small program to demonstrate this:
> > 
> > What are my possibilities for better DNS? I know that I can use erlang
> > dns instead of native dns, this solves the problem for 127.0.0.1, but
> > when I try a real address (i.e. run(50000,"some.server.of.mine.com"))
> > connections are very slow with both native and erlang DNS.
> > 
> > Any advice on a solution?
> 

....

> The Erlang DNS resolver performs caching, which helps a lot, but its cache is small by default.
> You can increase its size by adding {cache_size, 10000}. to erl_inetrc.
> (AFAIK, if Erlang uses the native glibc resolver, it does not use its own cache
> and leaves this to nscd/unscd or other mechanisms in glibc or the servers used.
> The cache is only used when Erlang itself connects to the DNS server,
> and not via glibc/nscd.)
> 

...
....

> If this is still too slow, add some kind of application-side cache
> in Erlang on top of it.
> An ETS table with some garbage collection, or some kind of LRU cache,
> will probably be sufficient. There are many possible and/or existing
> solutions for caching in Erlang.
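
A minimal sketch of such an ETS-backed cache with time-based expiry could
look like the following. The module name dns_ets_cache, its functions, and
the fixed TTL passed by the caller are assumptions made for illustration;
the TTL does not come from the DNS answer itself.

```erlang
-module(dns_ets_cache).
-export([new/0, lookup/3, purge/1]).

%% Create a public table so that any process can read and write it.
%% The table belongs to the calling process and dies with it.
new() ->
    ets:new(dns_ets_cache, [set, public, {read_concurrency, true}]).

%% Return the cached answer if it is still fresh; otherwise resolve the
%% name with the Erlang resolver and keep the answer for TtlSeconds.
lookup(Tab, Name, TtlSeconds) ->
    Now = erlang:monotonic_time(second),
    case ets:lookup(Tab, Name) of
        [{Name, Result, Expires}] when Expires > Now ->
            Result;
        _ ->
            case inet_res:gethostbyname(Name) of
                {ok, _} = Ok ->
                    ets:insert(Tab, {Name, Ok, Now + TtlSeconds}),
                    Ok;
                Error ->
                    %% Errors are not cached in this sketch.
                    Error
            end
    end.

%% "Garbage collection": delete every expired entry and return
%% the number of entries removed. Call it periodically.
purge(Tab) ->
    Now = erlang:monotonic_time(second),
    ets:select_delete(Tab, [{{'_', '_', '$1'}, [{'=<', '$1', Now}], [true]}]).
```

Note that concurrent misses for the same name still each trigger their own
lookup here; the table only helps once an answer has been stored.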

A few last issues to be aware of.

You can of course run into some scalability issues
with the Erlang cache or your own cache, especially if you are performing
lots of requests for the same name from lots of different
processes, just like in your example (the question is how representative it is).

I'm not sure whether the Erlang resolver detects this case and performs only
one request while the rest wait for the answer to become available in the
cache (as it will be).
Without this, the second call will discover that there is no entry in the cache
and start its own request to the nameserver. Similarly for the following calls.
We can call it a race condition.
It can drastically increase network usage.

For example, this (with the Erlang DNS resolver):

   [ spawn(fun() -> inet_res:gethostbyname("www.gazeta.pl"), ok end) || _ <- lists:seq(1,100) ], ok.

will lead to over 100 requests to the local nameserver (I have {lookup, [dns]}. and {nameserver, {127,0,0,1}}. in my configuration).

(How bad this is actually depends on whether the nameserver is on the same
machine, and on whether the server/cache software it runs is clever enough
not to make the same mistake as Erlang does.)
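
For reference, the resolver setup described above could be written in an
erl_inetrc file roughly like this. The cache_size value is the one suggested
earlier in the thread; the file name and its location are only an assumption,
so adjust them to your setup.

```erlang
%% erl_inetrc (sketch): use only the Erlang DNS client,
%% talking to a nameserver on the local machine.
{lookup, [dns]}.
{nameserver, {127,0,0,1}}.
%% Enlarge the resolver cache (number of cached DNS records).
{cache_size, 10000}.
```

The file can be loaded at startup with, for example,
erl -kernel inetrc '"./erl_inetrc"'.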

If you run the same code one more time, it will generate 0 requests.

(This is of course because responses, even from a local server, and especially
if the local server doesn't have them cached yet, come with considerable
latency, and all the inet_res:gethostbyname calls are executed before any answer
arrives, so no Erlang cache entries are populated yet to stop them from
executing an actual request.)


This can be solved by fixing Erlang, or by fixing your application-side cache,
because most often such bursts follow from the structure of your queries and
connections. Simply have two levels of caches: the main one being the Erlang
DNS cache, and the second being lots of small caches, each dedicated to a group
of processes which perform related tasks. For example, they crawl the same
subdomain, or retrieve all the content (images, scripts) of a single web page.
After all the processes in a group are done (you have downloaded the page and
all its images and scripts), you can discard the cache (a simple dict or
gb_trees will probably suffice) by terminating the cache process. This is a
much more scalable solution.
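
A per-group cache process of this kind could be sketched roughly as below.
This is only an illustration under some assumptions: the module name
group_dns_cache, its API, and the injectable Resolver fun are made up here,
and entries never expire because the cache is meant to be short-lived.
Since the lookup runs inside the cache process, concurrent requests for the
same name are serialized: the first one queries the resolver and the rest
are answered from the dict.

```erlang
-module(group_dns_cache).
-behaviour(gen_server).

-export([start_link/0, start_link/1, lookup/2, stop/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start one cache process per group of related worker processes.
start_link() ->
    start_link(fun inet_res:gethostbyname/1).

%% Resolver is a fun(Name) -> {ok, Result} | {error, Reason}.
%% Passing it in is an assumption of this sketch, handy for testing.
start_link(Resolver) when is_function(Resolver, 1) ->
    gen_server:start_link(?MODULE, Resolver, []).

%% Called by the workers of the group instead of the resolver.
lookup(Cache, Name) ->
    gen_server:call(Cache, {lookup, Name}, infinity).

%% Discard the whole cache once the group is done.
stop(Cache) ->
    gen_server:stop(Cache).

init(Resolver) ->
    {ok, {Resolver, dict:new()}}.

handle_call({lookup, Name}, _From, {Resolver, Dict} = State) ->
    case dict:find(Name, Dict) of
        {ok, Cached} ->
            {reply, Cached, State};
        error ->
            %% Callers queued behind this one wait in the mailbox
            %% and then find the stored answer.
            case Resolver(Name) of
                {ok, _} = Ok ->
                    {reply, Ok, {Resolver, dict:store(Name, Ok, Dict)}};
                Error ->
                    %% Errors are not cached, so a later call retries.
                    {reply, Error, State}
            end
    end.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

The trade-off is that lookups for different names within one group are
serialized as well, which should be acceptable when each group is small.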

You can also run into some dark areas, like the scalability of ETS
or system limits on source UDP ports and sockets. This is beyond the scope
of this problem, and is well explained in many other places.

That's all. :)

-- 
Witold Baryluk
JID: witold.baryluk // jabster.pl