[erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak

Raimo Niskanen raimo+erlang-bugs@REDACTED
Thu May 27 16:26:20 CEST 2010


On Thu, May 27, 2010 at 07:11:03AM -0700, Bob Ippolito wrote:
> I can confirm that the first commit fixes the port leak bug.

Great! Then that branch should be complete. The second
commit makes your DNS reply message below decode with
the TC bit set, which should make inet_res retry with
'usevc' internally, obsoleting your wrapper (hopefully).

> 
> On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
> <raimo+erlang-bugs@REDACTED> wrote:
> > I have created a fix for these problems:
> >    git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports
> >
> > It will be included in 'pu'.
> >
> > Unfortunately, the second commit eliminates the bug trigger
> > for what the first commit fixes. So to test if the bug fix
> > is fixing the bug, one should apply the first commit only.
> >
> > On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
> >> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> >> > Here's the DNS packet that is being received as a response to the query:
> >> >
> >> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> >> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> >> > 101,116,0,0,1,0,1>>).
> >> > {error,fmt}
> >>
> >> Thank you very much! I was about to give you detailed instructions
> >> about how to dig that up :-)
> >>
> >> I have spotted the problem.
> >>
> >> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
> >> 60 answer records, but actually contains zero. inet_dns expects to find
> >> 60 answer records if it says so. This is a hazy part of the
> >> DNS specifications and the resolver I tested truncation on did
> >> not do this kind of self-contradiction, but it _may_ be allowed
> >> by the specification...
> >>
> >> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
> >> since it should be real-world compatible not just specification compatible.
> >> It should allow record shortage in a section if the TC bit is set.
> >> I'll try to fix it in R14A.
> >>
> >> Can you try my patch adding a missing udp_close(S) to see
> >> if it stops the leaking port problem? That is a more serious bug.
> >>
> >> >
> >> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> >> > > Well, I'm not sure exactly which scenario is happening because I
> >> > > haven't looked at the packets yet, but the manual TCP retry is
> >> > > required.
> >> > >
> >> > > mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
> >> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> >> > > [async-threads:4] [hipe] [kernel-poll:true]
> >> > >
> >> > > Eshell V5.7.5  (abort with ^G)
> >> > > 1> lists:filter(fun erlang:is_port/1, element(2,
> >> > > erlang:process_info(self(), links))).
> >> > > []
> >> > > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> >> > > {error,timeout}
> >> > > 3> lists:filter(fun erlang:is_port/1, element(2,
> >> > > erlang:process_info(self(), links))).
> >> > > [#Port<0.514>]
> >> > > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> >> > > ok
> >> > >
> >> > >
> >> > > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> >> > > <raimo+erlang-bugs@REDACTED> wrote:
> >> > >> By reading the code it seems there is a bug when all nameservers
> >> > >> return an answer that causes decode errors, or can not be
> >> > >> contacted (enetunreach or econnrefused); then an
> >> > >> UDP port (or maybe two; one inet and one inet6) is leaked
> >> > >> since the inet_res:udp_close/1 is not called.
> >> > >>
> >> > >> This should be fixed with:
> >> > >>
> >> > >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> >> > >> index 9b9e078..3d38a01 100644
> >> > >> --- a/lib/kernel/src/inet_res.erl
> >> > >> +++ b/lib/kernel/src/inet_res.erl
> >> > >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
> >> > >>  query_retries(Q, NSs, Timer, Retry, I, S0) ->
> >> > >>     Num = length(NSs),
> >> > >>     if Num =:= 0 ->
> >> > >> +           udp_close(S),
> >> > >>            {error,timeout};
> >> > >>        true ->
> >> > >>            case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
> >> > >>
> >> > >> This "retry with TCP" trick of yours should really not be necessary
> >> > >> since inet_res retries with TCP if it gets a truncated UDP answer.
> >> > >> Have you got some other case when retrying with TCP is essential?
> >> > >>
> >> > >> Or, does your DNS server produce a (valid?) result that
> >> > >> triggers a debug bug in inet_res, causing the decode error,
> >> > >> triggering the port leak bug, forcing you to retry with TCP?
> >> > >>
> >> > >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> >> > >>> It appears that there may be an inet_udp Port leak in
> >> > >>> inet_res:resolve/4, our current workaround is to spawn a new process
> >> > >>> to call this function. We've noticed this primarily for a service that
> >> > >>> regularly does a UDP DNS query that fails (because the response is too
> >> > >>> big) and then we retry over TCP.
> >> > >>>
> >> > >>> This is what the state of the process looked like when it was leaking ports:
> >> > >>>
> >> > >>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
> >> > >>> erlang:process_info(whereis(dns_gen_server), links)))).
> >> > >>> 577
> >> > >>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
> >> > >>> lists:filter(fun erlang:is_port/1, element(2,
> >> > >>> erlang:process_info(whereis(dns_gen_server), links)))]).
> >> > >>> [{name,"udp_inet"}]
> >> > >>>
> >> > >>> The code looked like this, before the workaround was implemented:
> >> > >>>
> >> > >>> %% @spec dns(string()) -> [string()]
> >> > >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> >> > >>> %%     This may return an empty list if there no A records for this Host name.
> >> > >>> dns(Host) when is_list(Host) ->
> >> > >>>     dns(Host, fun inet_res:resolve/4).
> >> > >>>
> >> > >>> dns(Host, ResolveFun) ->
> >> > >>>     case ResolveFun(Host, in, a, []) of
> >> > >>>         {ok, Msg} ->
> >> > >>>             ips_for_answers(Msg);
> >> > >>>         {error, {nxdomain, _}} ->
> >> > >>>             [];
> >> > >>>         {error, timeout} ->
> >> > >>>             %% retry with TCP
> >> > >>>             case ResolveFun(Host, in, a, [{usevc, true}]) of
> >> > >>>                 {ok, Msg} ->
> >> > >>>                     ips_for_answers(Msg);
> >> > >>>                 {error, {nxdomain, _}} ->
> >> > >>>                     [];
> >> > >>>                 Error = {error, _} ->
> >> > >>>                     Error
> >> > >>>             end;
> >> > >>>         Error = {error, _} ->
> >> > >>>             Error
> >> > >>>     end.
> >> > >>>
> >> > >>> ips_for_answers(Msg) ->
> >> > >>>     [inet_parse:ntoa(inet_dns:rr(Answer, data))
> >> > >>>      || Answer <- inet_dns:msg(Msg, anlist)].
> >> > >>>
> >> > >>> The workaround we used was to call it indirectly with this function, I
> >> > >>> couldn't find anything in OTP that did the same thing that didn't have
> >> > >>> local call optimizations.
> >> > >>>
> >> > >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> >> > >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> >> > >>> process_apply(M,F,A) ->
> >> > >>>     %% We can't just use rpc here because there's a local call optimization.
> >> > >>>     Parent = self(),
> >> > >>>     Fun = fun () ->
> >> > >>>                   try
> >> > >>>                       Parent ! {self(), erlang:apply(M, F, A)}
> >> > >>>                   catch
> >> > >>>                       Class:Reason ->
> >> > >>>                           Stacktrace = erlang:get_stacktrace(),
> >> > >>>                           Parent ! {self(), Class, Reason, Stacktrace}
> >> > >>>                   end
> >> > >>>           end,
> >> > >>>     {Pid, Ref} = erlang:spawn_monitor(Fun),
> >> > >>>     receive
> >> > >>>         {Pid, Res} ->
> >> > >>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
> >> > >>>             Res;
> >> > >>>         {Pid, Class, Reason, Stacktrace} ->
> >> > >>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
> >> > >>>             erlang:error(erlang:raise(Class, Reason, Stacktrace));
> >> > >>>         {'DOWN', Ref, process, Pid, Reason} ->
> >> > >>>             erlang:exit(Reason)
> >> > >>>     end.
> >> > >>>
> >> > >>> ________________________________________________________________
> >> > >>> erlang-bugs (at) erlang.org mailing list.
> >> > >>> See http://www.erlang.org/faq.html
> >> > >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >> > >>
> >> > >> --
> >> > >>
> >> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >> > >>
> >> > >
> >> >
> >> > ________________________________________________________________
> >> > erlang-bugs (at) erlang.org mailing list.
> >> > See http://www.erlang.org/faq.html
> >> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >> >
> >>
> >> --
> >>
> >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >>
> >> ________________________________________________________________
> >> erlang-bugs (at) erlang.org mailing list.
> >> See http://www.erlang.org/faq.html
> >> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >
> > --
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
> 
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> 

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


More information about the erlang-bugs mailing list