[erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
Raimo Niskanen
raimo+erlang-bugs@REDACTED
Mon May 31 14:46:13 CEST 2010
On Thu, May 27, 2010 at 04:26:20PM +0200, Raimo Niskanen wrote:
> On Thu, May 27, 2010 at 07:11:03AM -0700, Bob Ippolito wrote:
> > I can confirm that the first commit fixes the port leak bug.
>
> Great! Then that branch should be complete. The second
> commit makes your DNS reply message below decode with
> the TC bit set, which should make inet_res retry with
> 'usevc' internally, obsoleting your wrapper (hopefully).
4aa2ead3149d3727ec6ad67b653ff51c74405671
New commit. The previous only worked for your special case.
What is not tested does not work...
It will be included in 'pu'.
>
> >
> > On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
> > <raimo+erlang-bugs@REDACTED> wrote:
> > > I have created a fix for these problems:
> > > git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports
> > >
> > > It will be included in 'pu'.
> > >
> > > Unfortunately, the second commit eliminates the bug trigger
> > > for what the first commit fixes. So to test if the bug fix
> > > is fixing the bug, one should apply the first commit only.
> > >
> > > On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
> > >> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> > >> > Here's the DNS packet that is being received as a response to the query:
> > >> >
> > >> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> > >> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> > >> > 101,116,0,0,1,0,1>>).
> > >> > {error,fmt}
> > >>
> > >> Thank you very much! I was about to give you detailed instructions
> > >> about how to dig that up :-)
> > >>
> > >> I have spotted the problem.
> > >>
> > >> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
> > >> 60 answer records, but actually contains zero. inet_dns expects to find
> > >> 60 answer records if it says so. This is a hazy part of the
> > >> DNS specifications and the resolver I tested truncation on did
> > >> not do this kind of self-contradiction, but it _may_ be allowed
> > >> by the specification...
> > >>
> > >> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
> > >> since it should be real-world compatible not just specification compatible.
> > >> It should allow record shortage in a section if the TC bit is set.
> > >> I'll try to fix it in R14A.
> > >>
> > >> Can you try my patch adding a missing udp_close(S) to see
> > >> if it stops the leaking port problem? That is a more serious bug.
> > >>
> > >> >
> > >> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> > >> > > Well, I'm not sure exactly which scenario is happening because I
> > >> > > haven't looked at the packets yet, but the manual TCP retry is
> > >> > > required.
> > >> > >
> > >> > > mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
> > >> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> > >> > > [async-threads:4] [hipe] [kernel-poll:true]
> > >> > >
> > >> > > Eshell V5.7.5 (abort with ^G)
> > >> > > 1> lists:filter(fun erlang:is_port/1, element(2,
> > >> > > erlang:process_info(self(), links))).
> > >> > > []
> > >> > > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> > >> > > {error,timeout}
> > >> > > 3> lists:filter(fun erlang:is_port/1, element(2,
> > >> > > erlang:process_info(self(), links))).
> > >> > > [#Port<0.514>]
> > >> > > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> > >> > > ok
> > >> > >
> > >> > >
> > >> > > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> > >> > > <raimo+erlang-bugs@REDACTED> wrote:
> > >> > >> By reading the code it seems there is a bug when all nameservers
> > >> > >> return an answer that causes decode errors, or can not be
> > >> > >> contacted (enetunreach or econnrefused); then an
> > >> > >> UDP port (or maybe two; one inet and one inet6) is leaked
> > >> > >> since the inet_res:udp_close/1 is not called.
> > >> > >>
> > >> > >> This should be fixed with:
> > >> > >>
> > >> > >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> > >> > >> index 9b9e078..3d38a01 100644
> > >> > >> --- a/lib/kernel/src/inet_res.erl
> > >> > >> +++ b/lib/kernel/src/inet_res.erl
> > >> > >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
> > >> > >> query_retries(Q, NSs, Timer, Retry, I, S0) ->
> > >> > >> Num = length(NSs),
> > >> > >> if Num =:= 0 ->
> > >> > >> + udp_close(S),
> > >> > >> {error,timeout};
> > >> > >> true ->
> > >> > >> case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
> > >> > >>
> > >> > >> This "retry with TCP" trick of yours should really not be necessary
> > >> > >> since inet_res retries with TCP if it gets a truncated UDP answer.
> > >> > >> Have you got some other case when retrying with TCP is essential?
> > >> > >>
> > >> > >> Or, does your DNS server produce a (valid?) result that
> > >> > >> triggers a debug bug in inet_res, causing the decode error,
> > >> > >> triggering the port leak bug, forcing you to retry with TCP?
> > >> > >>
> > >> > >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> > >> > >>> It appears that there may be an inet_udp Port leak in
> > >> > >>> inet_res:resolve/4, our current workaround is to spawn a new process
> > >> > >>> to call this function. We've noticed this primarily for a service that
> > >> > >>> regularly does a UDP DNS query that fails (because the response is too
> > >> > >>> big) and then we retry over TCP.
> > >> > >>>
> > >> > >>> This is what the state of the process looked like when it was leaking ports:
> > >> > >>>
> > >> > >>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
> > >> > >>> erlang:process_info(whereis(dns_gen_server), links)))).
> > >> > >>> 577
> > >> > >>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
> > >> > >>> lists:filter(fun erlang:is_port/1, element(2,
> > >> > >>> erlang:process_info(whereis(dns_gen_server), links)))]).
> > >> > >>> [{name,"udp_inet"}]
> > >> > >>>
> > >> > >>> The code looked like this, before the workaround was implemented:
> > >> > >>>
> > >> > >>> %% @spec dns(string()) -> [string()]
> > >> > >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> > >> > >>> %% This may return an empty list if there no A records for this Host name.
> > >> > >>> dns(Host) when is_list(Host) ->
> > >> > >>> dns(Host, fun inet_res:resolve/4).
> > >> > >>>
> > >> > >>> dns(Host, ResolveFun) ->
> > >> > >>> case ResolveFun(Host, in, a, []) of
> > >> > >>> {ok, Msg} ->
> > >> > >>> ips_for_answers(Msg);
> > >> > >>> {error, {nxdomain, _}} ->
> > >> > >>> [];
> > >> > >>> {error, timeout} ->
> > >> > >>> %% retry with TCP
> > >> > >>> case ResolveFun(Host, in, a, [{usevc, true}]) of
> > >> > >>> {ok, Msg} ->
> > >> > >>> ips_for_answers(Msg);
> > >> > >>> {error, {nxdomain, _}} ->
> > >> > >>> [];
> > >> > >>> Error = {error, _} ->
> > >> > >>> Error
> > >> > >>> end;
> > >> > >>> Error = {error, _} ->
> > >> > >>> Error
> > >> > >>> end.
> > >> > >>>
> > >> > >>> ips_for_answers(Msg) ->
> > >> > >>> [inet_parse:ntoa(inet_dns:rr(Answer, data))
> > >> > >>> || Answer <- inet_dns:msg(Msg, anlist)].
> > >> > >>>
> > >> > >>> The workaround we used was to call it indirectly with this function, I
> > >> > >>> couldn't find anything in OTP that did the same thing that didn't have
> > >> > >>> local call optimizations.
> > >> > >>>
> > >> > >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> > >> > >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> > >> > >>> process_apply(M,F,A) ->
> > >> > >>> %% We can't just use rpc here because there's a local call optimization.
> > >> > >>> Parent = self(),
> > >> > >>> Fun = fun () ->
> > >> > >>> try
> > >> > >>> Parent ! {self(), erlang:apply(M, F, A)}
> > >> > >>> catch
> > >> > >>> Class:Reason ->
> > >> > >>> Stacktrace = erlang:get_stacktrace(),
> > >> > >>> Parent ! {self(), Class, Reason, Stacktrace}
> > >> > >>> end
> > >> > >>> end,
> > >> > >>> {Pid, Ref} = erlang:spawn_monitor(Fun),
> > >> > >>> receive
> > >> > >>> {Pid, Res} ->
> > >> > >>> receive {'DOWN', Ref, process, Pid, _} -> ok end,
> > >> > >>> Res;
> > >> > >>> {Pid, Class, Reason, Stacktrace} ->
> > >> > >>> receive {'DOWN', Ref, process, Pid, _} -> ok end,
> > >> > >>> erlang:error(erlang:raise(Class, Reason, Stacktrace));
> > >> > >>> {'DOWN', Ref, process, Pid, Reason} ->
> > >> > >>> erlang:exit(Reason)
> > >> > >>> end.
> > >> > >>>
> > >> > >>> ________________________________________________________________
> > >> > >>> erlang-bugs (at) erlang.org mailing list.
> > >> > >>> See http://www.erlang.org/faq.html
> > >> > >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > >> > >>
> > >> > >> --
> > >> > >>
> > >> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >> > >>
> > >> > >
> > >> >
> > >> > ________________________________________________________________
> > >> > erlang-bugs (at) erlang.org mailing list.
> > >> > See http://www.erlang.org/faq.html
> > >> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > >> >
> > >>
> > >> --
> > >>
> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >>
> > >> ________________________________________________________________
> > >> erlang-bugs (at) erlang.org mailing list.
> > >> See http://www.erlang.org/faq.html
> > >> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > >
> > > --
> > >
> > > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >
> >
> > ________________________________________________________________
> > erlang-bugs (at) erlang.org mailing list.
> > See http://www.erlang.org/faq.html
> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
More information about the erlang-bugs
mailing list