From jay@REDACTED  Sun May  2 07:35:20 2010
From: jay@REDACTED (Jay)
Date: Sun, 02 May 2010 15:35:20 +1000
Subject: Bug in ssl_certificate.erl in R13B04
Message-ID: <op.vb2cg60szs4wx5@localhost>


I came across to the same issue with the latest Yaws (yaws-1.88) together  
with R13B04. Access to HTTPS was totally denied.

I had two choices I'll do a quick hack to R13B04 or go back to R13B03. Of  
course, I chose the harder way.

I am quite sure this is not the right way to fix it, but it seems to work  
for now:

+++ ssl_certificate.erl
@@ -147,6 +147,13 @@
                       public_key:pkix_issuer_id(ErlCertCandidate, self);
                   false ->
                       find_issuer(OtpCert, Key)
+           end;
+       {Key, [{_Cert, ErlCertCandidate, not_encrypted}]} ->
+           case public_key:pkix_is_issuer(OtpCert, ErlCertCandidate) of
+               true ->
+                   public_key:pkix_issuer_id(ErlCertCandidate, self);
+               false ->
+                   find_issuer(OtpCert, Key)
               end
        end.

Best Regards,

Jay

Sisutec
http://www.sisutec.com.au


From chetan.ahuja@REDACTED  Mon May  3 22:29:04 2010
From: chetan.ahuja@REDACTED (Chetan Ahuja)
Date: Mon, 3 May 2010 13:29:04 -0700
Subject: infinite loop when beam.smp compiled with -O2 on debian lenny
Message-ID: <q2i87eebda91005031329vb2d48a62y1aa4f45c652acd1e@mail.gmail.com>

Hi,

  We hit a bug while running rabbitmq where the beam.smp process was stuck
in a tight loop in the erts_poll_info method.
The process was eating up 100% of exactly one core (on a multi core box) and
rabbitmq was dysfunctional.  Unfortunately
I could not create a  small test case to reproduce this condition but it
would happen quite frequently while rabbitmq was in
operation.

The C code for the function didn't provide any hints on what would have been
spinning in that function
(first time looking at  this codebase though). Finally looking through the
 disassembly in gdb,  (at the point of where our process was spinning) I saw
the following  lines in the
erts_poll_info_kp method:


0x00000000004f0fe9 <erts_poll_info_kp+185>:     nopl   0x0(%rax)
0x00000000004f0ff0 <erts_poll_info_kp+192>:     jmp    0x4f0fe9
<erts_poll_info_kp+185>

(Similar assembly code  can be seen  when  the KERNEL_POLL  option is
disabled.)

 Clearly the above will trivially spin forever anytime we get into that
codepath.  The above
looks suspiciously like some code got optimized out by the compiler leaving
the crazy
loop code.

So I compiled with -O1 and then with no optimization at all.   Withe -O1, I
saw a
a weird jmp insruction jumping to it's own address:

0x0000000000517102 <erts_poll_info_kp+60>:      jmp    0x517102
<erts_poll_info_kp+60>

With no optimization,   any of those trivial spins did not exist but I
didn't analyze the unoptimized
code enough to say whether  it can be proven to have an infinite loop (i.e.,
whether the optimizing
compiler is simply doing it's job vs. this being a compiler bug).

Anyway, this problem exists at least since  erlang-base_12.b.3-dfsg debian
package version and has been
verified to exists in the  github version as of today.


 Her'es the gcc  and  debian version info:
 $ gcc --version
gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2
Copyright (C) 2008 Free Software Foundation, Inc.

$ cat /etc/debian_version
5.0


I'd be happy to provide any other info as needed.

Thanks
Chetan Ahuja

From dougedmunds@REDACTED  Mon May  3 22:51:28 2010
From: dougedmunds@REDACTED (Doug Edmunds (gmail))
Date: Mon, 03 May 2010 13:51:28 -0700
Subject: Bug: process unexpectedly exits loop
Message-ID: <4BDF3750.7020200@gmail.com>

Hello,

I'm posting a module (conn3.erl) below.
This module builds a hierarchical tree of PIDs.
There are two loops, one for entries, and another
for the position in the tree(called 'me').

Each 'entry' runs a copy of the entry_loop.
Each entry keeps track of its parent (one pid)
and its children (list of 0 or more pids). Entries
are not called directly.

The me process runs in the me_loop. It manages the
entries, and moves up and down in the tree, via:
me ! add, me ! up, me ! down, me ! delete, me ! show.

The bug I've encountered is when trying to move down the
tree when there are multiple children. Here's the basic
scenario:
After running conn3:start(),
type:
	me ! add.
	me ! add.
to create two children.

Now type:
	me ! down.

Because there are more than one child, the code calls
indexlist/3, which returns a list of tuples:
[{1, PID1}, {2, PID2}, ...}.
Then the next line in the 'down' message prints
that list.

After that the user is supposed to pick the child
by using the integer:

    Input = get_user_input("Enter key: "),

But the pid exits the loop before it reaches
that line. The 'me' pid is still alive, but exits
the loop.  I get no error message.

It fails both in Windows XP and in a linux os.

If someone can figure out what the problem is, much appreciated.

Doug Edmunds

--------------------

-module(conn3_full).
-compile(export_all).

%% usage conn:start().
%% then send messages to 'me' (see me_loop)

start()     ->
%    process_flag(trap_exit,  true),
     Me = spawn(?MODULE, me_loop,[[],[],[]]),
     register(me, Me),
     Top = spawn(?MODULE, entry_loop,[[],[],[]]),
     register(top, Top),
     me ! {first_time},
     top ! {first_time},
% uncomment this next line to get to the problem faster
%    me ! add, me ! add, me! show,
     ok.

me_loop(M,K,P) ->
%    io:format("--me_loop self(): ~p M: ~p K:~p P: ~p~n",[self(),M,K,P]),
     receive

	{first_time} ->
         NM = whereis(top),
	    io:format("--setting me to top: self(): ~p M: ~p K: ~p P: ~p ~n",
                 [self(), whereis(top), K, P]),
	    NM = whereis(top), NK = K, NP = P;

	show ->
	    io:format("--show self():~p M: ~p K: ~p P: ~p~n",[self(),M,K,P]),
	    NM = M, NK = K, NP = P;

	add ->
	    %%create an entry
	    Pid = spawn(?MODULE,entry_loop,[[],[],M]),
	    Pid ! {set_pid, Pid},
	    %%update the entry that 'me' is copying
	    M ! {p_add_kid, Pid},
	    %%update 'me'
	    K2 = [Pid|K],
	    NM = M, NK = K2, NP = P;


	del ->
	    case P of
		[] -> io:format("--At the top~n");
		_  -> P ! {p_update_kids, M, K},
		      ok = connect_kids_to_P(K,P),
		      M ! die,
		      me ! up
	    end,
	    NM = M, NK = K, NP = P;


	down ->
	    case length(K) of
		0 -> io:format("--No kids~n");
		1 -> [Head |_] = K,
		    Head ! {self(), info_request};
		_ -> Out = indexlist(1, K, []),
		    ok = io:format("~p~n", [Out]),
%%%%%% When more than one 'kid',
%%%%%% process drops out of loop here.  BUG?

             Input = get_user_input("Enter key: "),
		    {Int,Rest} = string:to_integer(Input),
		    case is_integer(Int) andalso Rest == [] of
			true ->
			     Pick = pick_pid(Out,Int),
			     case is_pid(Pick) of
				 true ->
				     Pick ! {self(), info_request};
				 _ -> io:format("that number is not on the list~n")
			     end;
			_ ->
			     io:format("must enter an integer~n")
		    end
	    end,
	    NM = M, NK = K, NP = P;

	up ->
	    case P of
		[] -> io:format("--At the top~n");
		_  -> P ! {self(),info_request}
	    end,
	    NM = M, NK = K, NP = P;

	{info_requested, M2, K2, P2} ->
	    NM = M2, NK = K2, NP = P2;

	die ->
	    exit("killed"),
	    io:format("~p died~n", [self()]),
	    NM = M, NK = K, NP = P;

	Anything ->
	    io:format("--me_loop got this:~p~n", [Anything]), NM = M, NK = K, 
NP = P
     end,
     me_loop (NM,NK,NP).


entry_loop(M,K,P) ->
%    io:format("--entry_loop self(): ~p M: ~p K:~p P: ~p~n",[self(),M,K,P]),
     receive

	{first_time} ->
	    io:format("--setting top self(): ~p M: ~p K: ~p P ~p ~n",
            [self(), whereis(top), K, P]),
	    NM = whereis(top), NK = K, NP = P;

	show ->
	    io:format("--show self():~p M: ~p K: ~p P: ~p~n",[self(),M,K,P]),
	    NM = M, NK = K, NP = P;

	{set_pid, Pid} ->
	    NM = Pid, NK = K, NP = P;

	{From,info_request} ->
	    From ! {info_requested, M, K, P},
	    NM = M, NK = K, NP = P;

	{p_update_kids, Kid, GrandKidsList} ->
	    K2 = lists:delete(Kid, K),
	    K3 = lists:append(GrandKidsList,K2),
	    %%still have to move me
	    NM = M, NK = K3, NP = P;

	{kid_change_p, GrandP} ->
	    P2 = GrandP,
	    NM = M, NK = K, NP = P2;

	{p_add_kid, Pid} ->
	    K2 = [Pid|K],
	    NM = M, NK = K2, NP = P;

%	{tell_kids_about_Pid, Pid, Msg} ->
%	    Pidlist = [Pidx || Pidx <- K, is_pid(Pid), Pid /= Pidx],
%	    %%% exclude Pid
%	    %%  io:format("--Pid list: ~p~n",[Pidlist]),
%	    ok = tell_list(Pidlist, Pid, Msg),
%	    NM = M, NK = K, NP = P;

	die ->
	    exit("killed"),
	    io:format("~p died~n", [self()]),
	    NM = M, NK = K, NP = P;

	Anything ->
	    io:format("--entry_loop Got this:~p~n", [Anything]), NM = M, NK = 
K, NP = P
     end,
     %%  io:format("here i am~n"),
     entry_loop (NM,NK,NP).

indexlist(Start, [H|T],Out) ->
     NewOut = lists: append ([{Start, H}], Out),
     Start2 = Start+1,
     indexlist(Start2,  T, NewOut);
indexlist(_, [], Out) -> lists:reverse(Out).

pick_pid(Out, Key) ->
     NewDict = dict:from_list(Out),
     case dict:is_key(Key,NewDict) of
         true -> dict:fetch(Key,NewDict);
         false -> "no such key"
     end.

get_user_input( Prompt ) ->
   string:strip(   % remove spaces from front and back
     string:strip( % remove line-feed from the end
       io:get_line( Prompt), right, $\n)).


connect_kids_to_P([],_) -> ok;
connect_kids_to_P(K,P) ->
     [H|T] = K, H ! {kid_change_p,P},
     connect_kids_to_P(T,P).

%%%not implemented
% tell_list([],_,_) ->  ok;
% tell_list([H|T],X,Msg) -> H ! {Msg, X}, tell_list(T,X, Msg).


%%%macro-ish utility
b_alive(String) ->  % ie b_alive("<0.35.0>")
     is_process_alive(list_to_pid(String)).

From mikpe@REDACTED  Mon May  3 23:54:31 2010
From: mikpe@REDACTED (Mikael Pettersson)
Date: Mon, 3 May 2010 23:54:31 +0200
Subject: [erlang-bugs] infinite loop when beam.smp compiled with -O2 on debian lenny
In-Reply-To: <q2i87eebda91005031329vb2d48a62y1aa4f45c652acd1e@mail.gmail.com>
References: <q2i87eebda91005031329vb2d48a62y1aa4f45c652acd1e@mail.gmail.com>
Message-ID: <19423.17943.80950.733236@pilspetsen.it.uu.se>

Chetan Ahuja writes:
 > Hi,
 > 
 >   We hit a bug while running rabbitmq where the beam.smp process was stuck
 > in a tight loop in the erts_poll_info method.
 > The process was eating up 100% of exactly one core (on a multi core box) and
 > rabbitmq was dysfunctional.  Unfortunately
 > I could not create a  small test case to reproduce this condition but it
 > would happen quite frequently while rabbitmq was in
 > operation.
 > 
 > The C code for the function didn't provide any hints on what would have been
 > spinning in that function
 > (first time looking at  this codebase though). Finally looking through the
 >  disassembly in gdb,  (at the point of where our process was spinning) I saw
 > the following  lines in the
 > erts_poll_info_kp method:
 > 
 > 
 > 0x00000000004f0fe9 <erts_poll_info_kp+185>:     nopl   0x0(%rax)
 > 0x00000000004f0ff0 <erts_poll_info_kp+192>:     jmp    0x4f0fe9
 > <erts_poll_info_kp+185>
 > 
 > (Similar assembly code  can be seen  when  the KERNEL_POLL  option is
 > disabled.)
 > 
 >  Clearly the above will trivially spin forever anytime we get into that
 > codepath.  The above
 > looks suspiciously like some code got optimized out by the compiler leaving
 > the crazy
 > loop code.
 > 
 > So I compiled with -O1 and then with no optimization at all.   Withe -O1, I
 > saw a
 > a weird jmp insruction jumping to it's own address:
 > 
 > 0x0000000000517102 <erts_poll_info_kp+60>:      jmp    0x517102
 > <erts_poll_info_kp+60>
 > 
 > With no optimization,   any of those trivial spins did not exist but I
 > didn't analyze the unoptimized
 > code enough to say whether  it can be proven to have an infinite loop (i.e.,
 > whether the optimizing
 > compiler is simply doing it's job vs. this being a compiler bug).
 > 
 > Anyway, this problem exists at least since  erlang-base_12.b.3-dfsg debian
 > package version and has been
 > verified to exists in the  github version as of today.
 > 
 > 
 >  Her'es the gcc  and  debian version info:
 >  $ gcc --version
 > gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2
 > Copyright (C) 2008 Free Software Foundation, Inc.

I looked at the procedure in question (not so easy to locate due to
some "creative" C preprocessor abuse), and noticed an obvious bug:
there's a loop over a linked list that forgets to actually advance
the node pointer to the next element. When optimizing, gcc will notice
that the loop doesn't terminate, omit the body of the loop (the
calculations are dead), which will result in the type of object code
shown above. Thus, it's an Erlang VM bug not a gcc miscompilation.

Try the patch below and let us know if it solves your problem.

/Mikael

--- otp_src_R13B03/erts/emulator/sys/common/erl_poll.c.~1~	2009-03-12 13:16:29.000000000 +0100
+++ otp_src_R13B03/erts/emulator/sys/common/erl_poll.c	2010-05-03 23:41:32.000000000 +0200
@@ -2404,6 +2404,7 @@ ERTS_POLL_EXPORT(erts_poll_info)(ErtsPol
 	while (urqbp) {
 	    size += sizeof(ErtsPollSetUpdateRequestsBlock);
 	    pending_updates += urqbp->len;
+	    urqbp = urqbp->next;
 	}
     }
 #endif

From sam@REDACTED  Tue May  4 01:23:26 2010
From: sam@REDACTED (Sam Bobroff)
Date: Tue, 04 May 2010 09:23:26 +1000
Subject: [erlang-bugs] Bug: process unexpectedly exits loop
In-Reply-To: <4BDF3750.7020200@gmail.com>
References: <4BDF3750.7020200@gmail.com>
Message-ID: <4BDF5AEE.5040401@m5net.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Doug,

On 4/05/10 6:51 AM, Doug Edmunds (gmail) wrote:
> Hello,
> 
> I'm posting a module (conn3.erl) below.
> This module builds a hierarchical tree of PIDs.
> There are two loops, one for entries, and another
> for the position in the tree(called 'me').

[snip]

Getting a backtrace often helps. This is what I did:

$ erl
Erlang R13B03 (erts-5.7.4) [source] [smp:2:2] [rq:2] [async-threads:0]
[kernel-poll:false]

Eshell V5.7.4  (abort with ^G)
1> conn3_full:start().
- --setting me to top: self(): <0.34.0> M: <0.35.0> K: [] P: []
- --setting top self(): <0.35.0> M: <0.35.0> K: [] P []
ok
- --show self():<0.34.0> M: <0.35.0> K: [<0.37.0>,<0.36.0>] P: []
2> me ! down.
[{1,<0.37.0>},{2,<0.36.0>}]
down
3> {backtrace, BT} = process_info(whereis(me), backtrace).
{backtrace,<<"Program counter: 0x0079f3c8 (io:wait_io_mon_reply/2 +
28)\nCP: 0x00000000 (invalid)\narity = 0\n\n0x002f6cbc Ret"...>>}
4> io:fwrite("~s\n", [binary_to_list(BT)]).
Program counter: 0x0079f3c8 (io:wait_io_mon_reply/2 + 28)
CP: 0x00000000 (invalid)
arity = 0

0x002f6cbc Return addr 0x007a14c0 (conn3_full:get_user_input/1 + 20)
y(0)     #Ref<0.0.0.37>
y(1)     <0.25.0>

0x002f6cc8 Return addr 0x007a0bf4 (conn3_full:me_loop/3 + 676)

0x002f6ccc Return addr 0x001a1df4 (<terminate process normally>)
y(0)     []
y(1)     [{1,<0.37.0>},{2,<0.36.0>}]
y(2)     []
y(3)     [<0.37.0>,<0.36.0>]
y(4)     <0.35.0>

ok

I can see that "me" is still in it's loop and that it's currently in
"io:wait_io_mon_reply". I don't know exactly what this function is but
my guess would be it's something to do with the shell and io:get_line
(actually wait_io_mon_reply) fighting over the terminal input. If we try
again with -noshell it might be better but then we won't be able to use
the shell to send messages to "me".

So, I modified the source to add "me ! down" in the set up sequence at
line 16, and also uncommented the debug at the top of me_loop, and now I
get:

$ erl -noshell -run conn3_full
- --me_loop self(): <0.29.0> M: [] K:[] P: []
- --setting top self(): <0.30.0> M: <0.30.0> K: [] P []
- --setting me to top: self(): <0.29.0> M: <0.30.0> K: [] P: []
- --me_loop self(): <0.29.0> M: <0.30.0> K:[] P: []
- --me_loop self(): <0.29.0> M: <0.30.0> K:[<0.31.0>] P: []
- --me_loop self(): <0.29.0> M: <0.30.0> K:[<0.32.0>,<0.31.0>] P: []
- --show self():<0.29.0> M: <0.30.0> K: [<0.32.0>,<0.31.0>] P: []
- --me_loop self(): <0.29.0> M: <0.30.0> K:[<0.32.0>,<0.31.0>] P: []
[{1,<0.32.0>},{2,<0.31.0>}]
Enter key: 2
- --me_loop self(): <0.29.0> M: <0.30.0> K:[<0.32.0>,<0.31.0>] P: []
- --me_loop self(): <0.29.0> M: <0.31.0> K:[] P: <0.30.0>

I entered "2" at the prompt and the loop has continued :-)

Does that help?

Sam.
- -- 
Sam Bobroff | sam@REDACTED | M5 Networks
Why does my email have those funny headers? Because I use PGP to sign
my email (and you should too!): that's how you know it's really from me.
See: http://en.wikipedia.org/wiki/Pretty_Good_Privacy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvfWu4ACgkQm97/UHSa/ASGbgCfaTDDP+03OxOKCaPzcYT46KGU
0b4An0mNhTQl6prV6AIML03ptKzMBACT
=m0bU
-----END PGP SIGNATURE-----

From chetan.ahuja@REDACTED  Tue May  4 01:43:27 2010
From: chetan.ahuja@REDACTED (Chetan Ahuja)
Date: Mon, 3 May 2010 16:43:27 -0700
Subject: [erlang-bugs] infinite loop when beam.smp compiled with -O2 on 
	debian lenny
In-Reply-To: <19423.17943.80950.733236@pilspetsen.it.uu.se>
References: <q2i87eebda91005031329vb2d48a62y1aa4f45c652acd1e@mail.gmail.com>
	 <19423.17943.80950.733236@pilspetsen.it.uu.se>
Message-ID: <h2g87eebda91005031643je56b1e1cmdcc1c0e4a2cdf2b1@mail.gmail.com>

Mikeal,

   Thanks a lot for that catch. I think that's it. Just did recompiles with
your patch (with -O2) and the body of the loop now shows up in the generated
code and the trivial spin loop is gone.

 I got blindsided by the optimizer completely  eliminating the body of the
loop, due to which I couldn't  even see urbqp on the stack at all !! This
 led me to the assumption that the surrounding macro
(ERTS_POLL_USE_UPDATE_REQUESTS_QUEUE)  was perhaps undefined and that loop
wasn't even compiled in. Yet another  strike against coding C in
pre-processor macros.

  Overall,  it's a  big relief to know that our standard install of gcc is
not generating such obviously buggy code.  I look forward to seeing the
erts_poll_info fix in an upcoming git version.

Thanks a lot once again
Chetan


On Mon, May 3, 2010 at 2:54 PM, Mikael Pettersson <mikpe@REDACTED> wrote:

> Chetan Ahuja writes:
>  > Hi,
>  >
>  >   We hit a bug while running rabbitmq where the beam.smp process was
> stuck
>  > in a tight loop in the erts_poll_info method.
>  > The process was eating up 100% of exactly one core (on a multi core box)
> and
>  > rabbitmq was dysfunctional.  Unfortunately
>  > I could not create a  small test case to reproduce this condition but it
>  > would happen quite frequently while rabbitmq was in
>  > operation.
>  >
>  > The C code for the function didn't provide any hints on what would have
> been
>  > spinning in that function
>  > (first time looking at  this codebase though). Finally looking through
> the
>  >  disassembly in gdb,  (at the point of where our process was spinning) I
> saw
>  > the following  lines in the
>  > erts_poll_info_kp method:
>  >
>  >
>  > 0x00000000004f0fe9 <erts_poll_info_kp+185>:     nopl   0x0(%rax)
>  > 0x00000000004f0ff0 <erts_poll_info_kp+192>:     jmp    0x4f0fe9
>  > <erts_poll_info_kp+185>
>  >
>  > (Similar assembly code  can be seen  when  the KERNEL_POLL  option is
>  > disabled.)
>  >
>  >  Clearly the above will trivially spin forever anytime we get into that
>  > codepath.  The above
>  > looks suspiciously like some code got optimized out by the compiler
> leaving
>  > the crazy
>  > loop code.
>  >
>  > So I compiled with -O1 and then with no optimization at all.   Withe
> -O1, I
>  > saw a
>  > a weird jmp insruction jumping to it's own address:
>  >
>  > 0x0000000000517102 <erts_poll_info_kp+60>:      jmp    0x517102
>  > <erts_poll_info_kp+60>
>  >
>  > With no optimization,   any of those trivial spins did not exist but I
>  > didn't analyze the unoptimized
>  > code enough to say whether  it can be proven to have an infinite loop
> (i.e.,
>  > whether the optimizing
>  > compiler is simply doing it's job vs. this being a compiler bug).
>  >
>  > Anyway, this problem exists at least since  erlang-base_12.b.3-dfsg
> debian
>  > package version and has been
>  > verified to exists in the  github version as of today.
>  >
>  >
>  >  Her'es the gcc  and  debian version info:
>  >  $ gcc --version
>  > gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2
>  > Copyright (C) 2008 Free Software Foundation, Inc.
>
> I looked at the procedure in question (not so easy to locate due to
> some "creative" C preprocessor abuse), and noticed an obvious bug:
> there's a loop over a linked list that forgets to actually advance
> the node pointer to the next element. When optimizing, gcc will notice
> that the loop doesn't terminate, omit the body of the loop (the
> calculations are dead), which will result in the type of object code
> shown above. Thus, it's an Erlang VM bug not a gcc miscompilation.
>
> Try the patch below and let us know if it solves your problem.
>
> /Mikael
>
> --- otp_src_R13B03/erts/emulator/sys/common/erl_poll.c.~1~      2009-03-12
> 13:16:29.000000000 +0100
> +++ otp_src_R13B03/erts/emulator/sys/common/erl_poll.c  2010-05-03
> 23:41:32.000000000 +0200
> @@ -2404,6 +2404,7 @@ ERTS_POLL_EXPORT(erts_poll_info)(ErtsPol
>        while (urqbp) {
>            size += sizeof(ErtsPollSetUpdateRequestsBlock);
>            pending_updates += urqbp->len;
> +           urqbp = urqbp->next;
>        }
>     }
>  #endif
>

From dougedmunds@REDACTED  Tue May  4 06:41:26 2010
From: dougedmunds@REDACTED (Doug Edmunds (gmail))
Date: Mon, 03 May 2010 21:41:26 -0700
Subject: [erlang-bugs] Bug: process unexpectedly exits loop
In-Reply-To: <4BDF5AEE.5040401@m5net.com>
References: <4BDF3750.7020200@gmail.com> <4BDF5AEE.5040401@m5net.com>
Message-ID: <4BDFA576.6030605@gmail.com>

On 5/3/2010 4:23 PM, Sam Bobroff wrote:

> So, I modified the source to add "me ! down" in the set up sequence at
> line 16,

It only works the first time.  Follow it with interactive
me ! up, then me ! down,
and you are right back at the problem.

The objective is to allow the 'me' process to traverse the entire
tree, and to add/delete branches.  Being unable to use me ! down
(with a selection) only one time isn't adequate to the goal.   I
am trying to figure out a way to work around whatever conflict is
happening.

-- Doug Edmunds


From bgustavsson@REDACTED  Tue May  4 08:27:26 2010
From: bgustavsson@REDACTED (=?UTF-8?Q?Bj=C3=B6rn_Gustavsson?=)
Date: Tue, 4 May 2010 08:27:26 +0200
Subject: [erlang-bugs] sys:get_status kills gen_servers registered 
	globally with something other than an atom
In-Reply-To: <4BDA74F1.3050509@microforte.com>
References: <4BDA74F1.3050509@microforte.com>
Message-ID: <t2p6672d0161005032327u51a090fcreae4bd0321ebc9a0@mail.gmail.com>

2010/4/30 Paul Hampson <PaulH@REDACTED>:
> Using Erlang R13B04, I'm creating gen_servers with gen_server:start( { global, "Name" } ). This is valid according to the gen_server manpage, which states GlobalName is term().
>
> However, if I call sys:get_status({global, "Name"}) (or sys_get:status( global:whereis_name( "Name" ) ) to rule out the name lookup as an issue) the gen_server dies, with:
>
> ? ?exception error: no true branch found when evaluating an if expression
> ? ? ?in function ?gen_server:format_status/2
> ? ? ?in call from sys:get_status/5
> ? ? ?in call from sys:do_cmd/6
> ? ? ?in call from sys:handle_system_msg/8
>
> This is because of the following code in gen_server:format_status:
>
> ? ?NameTag = if is_pid(Name) ->
> ? ? ? ? ? ? ?pid_to_list(Name);
> ? ? ? ? is_atom(Name) ->
> ? ? ? ? ? ? ?Name
> ? ? ? ? ?end,
> ? ?Header = lists:concat(["Status for generic server ", NameTag]),
>
> Which fails to handle that Name (which is stripped of {global,} in gen_server:name/1 or gen_server:get_proc_name/1) may be something other than an atom or pid.
>
> Interestingly, the comment above gen_server:start/3 indicates that the supplied server name is { global, atom() }, not { global, term() } as per the documentation.
>
> So either the documentation is wrong, or the gen_server implementation/comment is wrong.
>
> Bizarrely, I'm sure I was able to use sys:get_status against these same gen_servers a month ago, which would have been R13B03 or maybe an B13B03, but erlang/otp on github doesn't indicate any relevant changes.

The relevant change is in sys.erl in this commit:

http://github.com/erlang/otp/commit/88b530ea24977081020feb2123124063e58dfc12

The gen_server:format_status/2 function did not get called before that change.
Since that change introduces useful functionality, we don't plan to revert it,
but to fix the damage in R14.

One way to fix that would problem could be simply to change the
code for calculating NameTag to:

    NameTag = if is_pid(Name) ->
		      pid_to_list(Name);
		 _ ->
		      Name
	      end,

-- 
Bj?rn Gustavsson, Erlang/OTP, Ericsson AB

From mevans@REDACTED  Tue May  4 16:12:04 2010
From: mevans@REDACTED (Evans, Matthew)
Date: Tue, 4 May 2010 10:12:04 -0400
Subject: pg2 is broken R13B04
Message-ID: <B293D35E9E20694A8D9DFFE1BC655C555683E6ECE0@vvexch.verivue.com>

Hi,

So after more tests I have seen that pg2 is definitely not working as intended.

It appears that the root problem is how a new instantiation of pg2 within a cluster of Erlang nodes gets its data.

The following sequences of events occur.

1) All nodes do a net_kernel:monitor_nodes(true) in the init function of pg2.

2) The new instance of pg2 will send {new_pg2, node()} to all other nodes in the pool.

3) The new instance of pg2 will send {nodeup, Node} to itself (where nodes is a list of nodes()).

What it appears is that when only 2 nodes are in the pool things are generally ok. However, the synchronization process gets muddied when there are many members.

The process of updating the nodes is that upon receipt of {new_pg2,Node} or {nodeup,Node} to literally go through the table of pids in the ets pg2_table and build a list similar to:

[proxy_micro_cache,[<6325.319.0>,<6324.324.0>]]]

This is dispatched to the new pg2 instance. The problem is every node does that, so the new pg2 instance will end up with a table like:

[proxy_micro_cache,
  [<6437.319.0>,<6437.319.0>,<6437.319.0>,<6437.319.0>,
   <6437.319.0>,<6437.319.0>,<6436.324.0>,<6436.324.0>,
   <6436.324.0>,<6436.324.0>,<6436.324.0>,<6436.324.0>]]]

Where there are many instances for each Pid since each node has sent its copy of the data, causing that process to be replicated many times (i.e. the call to ets:update_counter in pg2:join_group)!!!

This problem is compounded further on nodes that join later, or when a VM stops and is restarted.

An additional problem arises when another process in the new group sends pg2:join on its own. In that case there is a timing window whereby the new instance could get that new entry more than once.

My recommendation is:

1) Have a new pg2:join function called pg2:join_once. In this case a process will never be permitted to have more than 1 join.

2) When a new node joins one could either select only one node to get its data from, or have all nodes in the system send the result of ets:tab2list(pg2_table) to the new node, then have that data inserted directly into its local ets table, as opposed to going through the process of join_group (possibly with the additional step erlang:monitor/2). In this way a process that has been registered more than once would be inserted into the local ets table as a single operation as opposed to many times.

3) Possibly defer new requests to pg2:join on the new instance until synchronization is complete.

I understand that gproc is on the way, but I suspect that pg2 does need fixing.

Regards

Matt

From mevans@REDACTED  Tue May  4 16:16:18 2010
From: mevans@REDACTED (Evans, Matthew)
Date: Tue, 4 May 2010 10:16:18 -0400
Subject: pg2 is broken R13B04
Message-ID: <B293D35E9E20694A8D9DFFE1BC655C555683E6ECE7@vvexch.verivue.com>

I should also add that we have had pg2 crash a VM on us, where we have some groups with in excess of 1,000,000 members (when only 140 processes have done a pg2:join - the join is done in the process's init function so we know it's only sent once).

Of course, what happens is:

1) A huge message is built and sent.

2) This message is created and processes in a list comprehension.

We have implemented a workaround by creating pg3 that only permits a single join per process.

We did this in the pg2:join_group by modifying the ets:update counter UpdateOp from {2,+1}, to {2,+1,1,1} (and similar logic in pg2:leave_group).

Matt

________________________________
From: Evans, Matthew
Sent: Tuesday, May 04, 2010 10:12 AM
To: 'erlang-bugs@REDACTED'
Subject: pg2 is broken R13B04

Hi,

So after more tests I have seen that pg2 is definitely not working as intended.

It appears that the root problem is how a new instantiation of pg2 within a cluster of Erlang nodes gets its data.

The following sequences of events occur.

1) All nodes do a net_kernel:monitor_nodes(true) in the init function of pg2.

2) The new instance of pg2 will send {new_pg2, node()} to all other nodes in the pool.

3) The new instance of pg2 will send {nodeup, Node} to itself (where nodes is a list of nodes()).

What it appears is that when only 2 nodes are in the pool things are generally ok. However, the synchronization process gets muddied when there are many members.

The process of updating the nodes is that upon receipt of {new_pg2,Node} or {nodeup,Node} to literally go through the table of pids in the ets pg2_table and build a list similar to:

[proxy_micro_cache,[<6325.319.0>,<6324.324.0>]]]

This is dispatched to the new pg2 instance. The problem is every node does that, so the new pg2 instance will end up with a table like:

[proxy_micro_cache,
  [<6437.319.0>,<6437.319.0>,<6437.319.0>,<6437.319.0>,
   <6437.319.0>,<6437.319.0>,<6436.324.0>,<6436.324.0>,
   <6436.324.0>,<6436.324.0>,<6436.324.0>,<6436.324.0>]]]

Where there are many instances for each Pid since each node has sent its copy of the data, causing that process to be replicated many times (i.e. the call to ets:update_counter in pg2:join_group)!!!

This problem is compounded further on nodes that join later, or when a VM stops and is restarted.

An additional problem arises when another process in the new group sends pg2:join on its own. In that case there is a timing window whereby the new instance could get that new entry more than once.

My recommendation is:

1) Have a new pg2:join function called pg2:join_once. In this case a process will never be permitted to have more than 1 join.

2) When a new node joins one could either select only one node to get its data from, or have all nodes in the system send the result of ets:tab2list(pg2_table) to the new node, then have that data inserted directly into its local ets table, as opposed to going through the process of join_group (possibly with the additional step erlang:monitor/2). In this way a process that has been registered more than once would be inserted into the local ets table as a single operation as opposed to many times.

3) Possibly defer new requests to pg2:join on the new instance until synchronization is complete.

I understand that gproc is on the way, but I suspect that pg2 does need fixing.

Regards

Matt

From peterke@REDACTED  Sun May  9 21:27:56 2010
From: peterke@REDACTED (=?ISO-8859-1?Q?P=E9ter_Szil=E1gyi?=)
Date: Sun, 9 May 2010 22:27:56 +0300
Subject: epipe error on port, uncatchable exception
Message-ID: <s2w1d492d71005091227o956af20fub3d2c82cc5c8aeb7@mail.gmail.com>

Hi,

  I've been trying to get some basic port operations going, but sometimes I
get a very peculiar error: epipe exception. The problem is that according to
the documentation this should never happen, yet it does, what's more ,
completely randomly. I can execute the same command and one time is
succeeds, another time it fails (let's say 1/10 failures). The even more
interesting part is, that I cannot catch the exception.

  I've written a very basic module to reproduce the error, which just
executes "ls -al" 1000 times (see below), passing in a small input data
(this is the reason of the crash).

  The exception below doesn't happen on all machines (I'm using openSuSE
11.2 x64, with Erlang R13B04 (also x64)). On an Ubuntu it ran just fine. Now
it may turn out that the OS is doing something strange causing the broken
pipes, BUT even so, I should be able to catch it.

Any feedback is appreciated,
  Peter

portbug.erl:
----------
-module(portbug).
-compile(export_all).

crash_it() ->
        try        lists:foreach(fun(_) -> do_something_portlike() end,
lists:seq(1, 1000))
        catch
                Class:Exception -> io:format("Caught: ~p:~p", [Class,
Exception])
        end.

do_something_portlike() ->
        Command = "ls -al",
        Port    = open_port({spawn, Command}, [stream, use_stdio,
stderr_to_stdout, binary, eof]),
        Port ! {self(), {command, <<"some random data">>}},
        Port ! {self(), close}.
----------
(shell@REDACTED)232> portbug:crash_it().

exception exit: epipe**
----------

From vances@REDACTED  Mon May 10 02:57:41 2010
From: vances@REDACTED (Vance Shipley)
Date: Sun, 9 May 2010 20:57:41 -0400
Subject: Inets httpd ignores debug options
Message-ID: <20100510005741.GF96961@h216-235-12-174.host.egate.net>

The init/1 callback in the supervisor simply ignores the 
documented debug functions.  That wasted a bit of my time.

-- 
	-Vance

From fritchie@REDACTED  Sat May 15 23:01:16 2010
From: fritchie@REDACTED (Scott Lystig Fritchie)
Date: Sat, 15 May 2010 16:01:16 -0500
Subject: net_kernel hang, perhaps blocked by busy_dist_port race?
Message-ID: <63794.1273957276@snookles.snookles.com>

Hi, all.  We've been bitten by a rather mysterious bug that has
disrupted Erlang message passing on roughly 10% of all nodes in a 100+
node cluster.  The same thing happened on 10 nodes within a 2-3 second
time window.  No further communication with the affected nodes via
Erlang message passing is possible.

I'm wondering if there's a possible race condition when two nodes A
and Z are communicating with each other, like this:

   1. Z makes a bunch of RPCs to A.
   2. A starts sending RPC replies to Z.
   3. Z decides to behave erratically, cause unknown.
   4. A's TCP connection to Z becomes "busy", probably because Z
      cannot or will not read data on the A <-> Z TCP connection.
   5. All processes on A that are trying to reply to Z are blocked and
      unscheduled; 'busy_dist_port' messages are generated for all of
      them. 
   6. The 'net_kernel' process on A is one of the procs blocked by the
      'busy_dist_port' events.
   7. A's connection to Z is broken.  The system message reported is:
      {nodedown_reason,connection_closed},{node_type,visible}

... and then A's 'net_kernel' process remains blocked forever?  Or is
alive but isn't working correctly?

Details below.  Sorry I don't have a patch available.

-Scott

Environment
-----------

    * ErlangOTP R13B04, -smp auto +A 64 +K true
        - Patched to change "erts_de_busy_limit" from default 
          128KB to 4096KB
    * Linux kernel, RedHat EL4 kernel IIRC of some flavor
    * NTP configured and running correctly on all machines (to help
      correlate log file timestamps)
    * Cluster of 50+ physical machines, 100+ Erlang VMs/nodes total
    * All nodes are using erlang:system_monitor() BIF for big
      heap, long garbage collection, and busy dist port events.
    * All nodes report node up and down events via BIF
      erlang:process_flag(monitor_nodes,true).

Sequence of events summary
--------------------------

1. One node in the cluster (checking the health of other nodes) makes
several thousand gen_server RPC calls to various servers on all other
nodes in somewhere between 1 and 5 second cycles (depending on what's
being monitored).  This node's name is 'app@REDACTED'.

2. The 'app@REDACTED' hits a weird problem.  We still can't figure out
what happened, but it behaved like an extremely intermittent network
partition that only affected boxZ.

3. Within 2-3 seconds, 10 nodes on the network become completely
unresponsive and cannot recover:

    * The 'app@REDACTED' node cannot talk to them, but that's not
      surprising because 'app@REDACTED' is having its own problems.
    * All other nodes report net_tick_timeout errors.
    * All attempts such as "erl -sname tmp$$ -remsh app@REDACTED" to
      connect to the hosed node fail.

Sequence of events detail
-------------------------

* At time T, +- 1 second, there are multiple reports of the same net
distribution port being blocked, e.g. #Port<0.213633546> on app@REDACTED:

  a. {monitor,<0.6824.3617>,busy_dist_port,#Port<0.213633546>}
     This is from the VM, triggered by the system_monitor BIF()

  b. sysmon_server: process <0.6824.3617> info:
     [{registered_name,foo},{initial_call,{proc_lib,init_p,5}},{current_function,{erlang,bif_return_trap,1}}]
     This is from the system_monitor event collector, which tries to
     find some helpful info about the process.

All 10 machines register anywhere from 8 to 15 of these pairs of
messages.  For each machine, all complaints are about the same Erlang
port #.

* Within T + 1 seconds, there's a report on the same port # that the
net_kernel process has been blocked, e.g. on app@REDACTED:

  a. {monitor,<0.23.0>,busy_dist_port,#Port<0.213633546>}

  b. sysmon_server: process <0.23.0> info:
     [{registered_name,net_kernel},{initial_call,{proc_lib,init_p,5}},{current_function,{erlang,bif_return_trap,1}}]

There is no direct evidence that shows that the blocked ports, e.g.
#Port<0.213633546> on app@REDACTED, are the ones that are used for
communication with app@REDACTED node, but it appears quite likely to be
true.

* Within T + 3 seconds (and usually within T + 2 seconds), there's a
  report that app@REDACTED is down, e.g. on app@REDACTED:

  net_kernel: node app@REDACTED down info [{nodedown_reason,connection_closed},{node_type,visible}]

* 20 seconds later, all other nodes in the cluster drop their connections
to these 9 nodes, due to {nodedown_reason,net_tick_timeout} reason.

* No further communication via Erlang message passing is possible:
  existing nodes cannot reconnect, and new nodes (e.g. "erl -remsh
  app@REDACTED") cannot connect.

* We used "gcore" to snag core dumps from 4 of the 10 affected nodes.
  The GDB backtrace doesn't reveal much to my untrained eyes.

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /lib64/libutil.so.1...done.
Loaded symbols for /lib64/libutil.so.1
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /usr/lib64/libncurses.so.5...done.
Loaded symbols for /usr/lib64/libncurses.so.5
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/librt.so.1...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /ert/lib/erlang/lib/crypto-1.6.4/priv/lib/crypto_drv.so...done.
Loaded symbols for /ert/lib/erlang/lib/crypto-1.6.4/priv/lib/crypto_drv.so
Reading symbols from /ert/openssl/lib/libcrypto.so.0.9.8...done.
Loaded symbols for /ert/openssl/lib/libcrypto.so.0.9.8
Core was generated by `/ert/lib/erlang/erts-5.7.5/bin/beam.smp'.
[New process 18963]
[New process 18964]
[New process 18965]
[New process 18966]
[New process 18967]
[New process 18968]
[New process 18969]
[New process 18970]
[New process 18971]
[New process 18972]
[New process 18973]
[New process 18974]
[New process 18975]
[New process 18976]
[New process 18977]
[New process 18978]
[New process 18979]
[New process 18980]
[New process 18981]
[New process 18982]
[New process 18983]
[New process 18984]
[New process 18985]
[New process 18986]
[New process 18987]
[New process 18988]
[New process 18989]
[New process 18990]
[New process 18991]
[New process 18992]
[New process 18993]
[New process 18994]
[New process 18995]
[New process 18996]
[New process 18997]
[New process 18998]
[New process 18999]
[New process 19000]
[New process 19001]
[New process 19002]
[New process 19003]
[New process 19004]
[New process 19005]
[New process 19006]
[New process 19007]
[New process 19008]
[New process 19009]
[New process 19010]
[New process 19011]
[New process 19012]
[New process 19013]
[New process 19014]
[New process 19015]
[New process 19016]
[New process 19017]
[New process 19018]
[New process 19019]
[New process 19020]
[New process 19021]
[New process 19022]
[New process 19023]
[New process 19024]
[New process 19025]
[New process 19026]
[New process 19027]
[New process 19028]
[New process 19029]
[New process 19030]
[New process 19031]
[New process 19032]
[New process 19033]
[New process 19034]
[New process 19035]
[New process 19036]
[New process 19037]
[New process 19038]
[New process 19039]
[New process 19040]
[New process 19041]
[New process 19042]
[New process 19043]
[New process 19044]
[New process 19045]
[New process 18956]
#0  0x0000003383c0d2cb in read () from /lib64/libpthread.so.0
(gdb) thread apply all where

Thread 84 (process 18956):
#0  0x00000033830cc5e2 in select () from /lib64/libc.so.6
#1  0x000000000052a900 in erts_sys_main_thread () at sys/unix/sys.c:3019
#2  0x000000000044d1ef in erl_start (argc=35, argv=<value optimized out>) at beam/erl_init.c:1330
#3  0x0000000000430429 in main (argc=0, argv=0x0) at sys/unix/erl_main.c:29

Thread 83 (process 19045):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=16, rq=0x2b035e3bb3b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 82 (process 19044):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=15, rq=0x2b035e3bb1b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 81 (process 19043):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=14, rq=0x2b035e3bafb0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 80 (process 19042):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=13, rq=0x2b035e3badb0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 79 (process 19041):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=12, rq=0x2b035e3babb0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 78 (process 19040):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=11, rq=0x2b035e3ba9b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 77 (process 19039):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=10, rq=0x2b035e3ba7b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 76 (process 19038):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=9, rq=0x2b035e3ba5b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 75 (process 19037):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=8, rq=0x2b035e3ba3b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 74 (process 19036):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=7, rq=0x2b035e3ba1b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 73 (process 19035):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=6, rq=0x2b035e3b9fb0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 72 (process 19034):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=5, rq=0x2b035e3b9db0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 71 (process 19033):
#0  0x00000033830d3488 in epoll_wait () from /lib64/libc.so.6
#1  0x0000000000530f04 in erts_poll_wait_kp (ps=0x2b035e2f3d38, pr=0x44b25660, len=0x44b25e7c, utvp=<value optimized out>) at sys/common/erl_poll.c:1907
#2  0x0000000000533bdb in erts_check_io_kp (do_wait=<value optimized out>) at sys/common/erl_check_io.c:1156
#3  0x000000000049b896 in sched_sys_wait (no=4, rq=0x2b035e3b9bb0) at beam/erl_process.c:785
#4  0x00000000004a1dd2 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6020
#5  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#6  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#7  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#8  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#9  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 70 (process 19032):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=3, rq=0x2b035e3b99b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 69 (process 19031):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=2, rq=0x2b035e3b97b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 68 (process 19030):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000049b6e2 in sched_cnd_wait (no=1, rq=0x2b035e3b95b0) at beam/erl_threads.h:632
#2  0x00000000004a1a93 in schedule (p=<value optimized out>, calls=<value optimized out>) at beam/erl_process.c:6026
#3  0x000000000050eb2d in process_main () at beam/beam_emu.c:1161
#4  0x000000000049f322 in sched_thread_func (vesdp=<value optimized out>) at beam/erl_process.c:3060
#5  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#6  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#7  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 67 (process 19029):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 66 (process 19028):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 65 (process 19027):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 64 (process 19026):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 63 (process 19025):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 62 (process 19024):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 61 (process 19023):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 60 (process 19022):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 59 (process 19021):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 58 (process 19020):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 57 (process 19019):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 56 (process 19018):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 55 (process 19017):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 54 (process 19016):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 53 (process 19015):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 52 (process 19014):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 51 (process 19013):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 50 (process 19012):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 49 (process 19011):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 48 (process 19010):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 47 (process 19009):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 46 (process 19008):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 45 (process 19007):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 44 (process 19006):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 43 (process 19005):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 42 (process 19004):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 41 (process 19003):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 40 (process 19002):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 39 (process 19001):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 38 (process 19000):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 37 (process 18999):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 36 (process 18998):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 35 (process 18997):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 34 (process 18996):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 33 (process 18995):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 32 (process 18994):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 31 (process 18993):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 30 (process 18992):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 29 (process 18991):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 28 (process 18990):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 27 (process 18989):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 26 (process 18988):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 25 (process 18987):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 24 (process 18986):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 23 (process 18985):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 22 (process 18984):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 21 (process 18983):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 20 (process 18982):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 19 (process 18981):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 18 (process 18980):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 17 (process 18979):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 16 (process 18978):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 15 (process 18977):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 14 (process 18976):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 13 (process 18975):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 12 (process 18974):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 11 (process 18973):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 10 (process 18972):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 9 (process 18971):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 8 (process 18970):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 7 (process 18969):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 6 (process 18968):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 5 (process 18967):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 4 (process 18966):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004e9bdf in async_main (arg=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 3 (process 18965):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000046df1c in sys_msg_dispatcher_func (unused=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 2 (process 18964):
#0  0x0000003383c0a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000000000471ee4 in emergency_watchdog (unused=<value optimized out>) at beam/erl_threads.h:632
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6

Thread 1 (process 18963):
#0  0x0000003383c0d2cb in read () from /lib64/libpthread.so.0
#1  0x000000000052b51e in signal_dispatcher_thread_func (unused=<value optimized out>) at sys/unix/sys.c:2913
#2  0x0000000000585064 in thr_wrapper (vtwd=<value optimized out>) at common/ethread.c:475
#3  0x0000003383c06367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000033830d309d in clone () from /lib64/libc.so.6
(gdb) 

From fritchie@REDACTED  Sun May 16 02:07:01 2010
From: fritchie@REDACTED (Scott Lystig Fritchie)
Date: Sat, 15 May 2010 19:07:01 -0500
Subject: [erlang-bugs] net_kernel hang, perhaps blocked by busy_dist_port race? 
In-Reply-To: Message of "Sat, 15 May 2010 16:01:16 CDT."
             <63794.1273957276@snookles.snookles.com> 
Message-ID: <72573.1273968421@snookles.snookles.com>

Following up on my previous message ... I've been able to duplicate
this bug, whee!  I'm going to try to create a mostly-automatable recipe
to make it easier for others to try to reproduce.

-Scott

Got msg {monitor,<0.21.0>,busy_dist_port,#Port<0.547>}
Got msg {monitor,<0.40.0>,busy_dist_port,#Port<0.547>}
Got msg {monitor,<0.40.0>,busy_dist_port,#Port<0.547>}
Got msg {monitor,<0.21.0>,busy_dist_port,#Port<0.547>}
Got msg {nodedown,goofus@REDACTED,
                  [{nodedown_reason,connection_closed},{node_type,visible}]}

User switch command
 --> s
 --> c
Eshell V5.7.5  (abort with ^G)
(bar@REDACTED)1> whereis(net_kernel).
<0.21.0>

(bar@REDACTED)2> process_info(whereis(net_kernel)).
[{registered_name,net_kernel},
 {current_function,{erlang,bif_return_trap,1}},
 {initial_call,{proc_lib,init_p,5}},
 {status,suspended},
 {message_queue_len,147},
 {messages,[tick,tick,tick,tick,
            {'EXIT',<0.57.0>,connection_closed},
            tick,tick,tick,tick,tick,tick,
            {accept,<0.22.0>,#Port<0.549>,inet,tcp},
            tick,tick,tick,tick,tick,tick,tick,tick,tick|...]},
 {links,[<0.23.0>,<0.75.0>,<0.18.0>,<0.22.0>,#Port<0.62>]},
 {dictionary,[{'$ancestors',[net_sup,kernel_sup,<0.9.0>]},
              {longnames,false},
              {'$initial_call',{net_kernel,init,1}}]},
 {trap_exit,true},
 {error_handler,error_handler},
 {priority,max},
 {group_leader,<0.8.0>},
 {total_heap_size,1974},
 {heap_size,1597},
 {stack_size,12},
 {reductions,502181},
 {garbage_collection,[{min_bin_vheap_size,46368},
                      {min_heap_size,233},
                      {fullsweep_after,65535},
                      {minor_gcs,825}]},
 {suspending,[]}]

(bar@REDACTED)3> Bt1 = process_info(whereis(net_kernel), backtrace).
<<"...">>

(bar@REDACTED)6> io:format("~s\n", [element(2,Bt1)]).     
Program counter: 0x08243388 (unknown function)
CP: 0xb76cd7b4 (gen_server:reply/2 + 104)
arity = 1
   {#Ref<6666.0.0.35773>,yes}

0xb48893ec Return addr 0xb76cf9f4 (gen_server:handle_msg/5 + 424)
y(0)     Catch 0xb76cd7b4 (gen_server:reply/2 + 104)

0xb48893f4 Return addr 0xb76a5258 (proc_lib:init_p_do_apply/3 + 28)
y(0)     net_kernel
y(1)     []
y(2)     net_kernel
y(3)     <0.18.0>
y(4)     []
y(5)     []
y(6)     {state,bar,'bar@REDACTED',shortnames,{tick,<0.23.0>,5000},7000,sys_dist,[{<0.75.0>,'foo@REDACTED'},{<0.57.0>,'goofus@REDACTED'}],[],[{listen,#Port<0.62>,<0.22.0>,{net_address,{{0,0,0,0},48326},"bb3",tcp,inet},inet_tcp_dist}],[],0,all}

0xb4889414 Return addr 0x0824852c (<terminate process normally>)
y(0)     Catch 0xb76a5268 (proc_lib:init_p_do_apply/3 + 44)

ok

From fritchie@REDACTED  Sun May 16 04:51:59 2010
From: fritchie@REDACTED (Scott Lystig Fritchie)
Date: Sat, 15 May 2010 21:51:59 -0500
Subject: [erlang-bugs] net_kernel hang, perhaps blocked by busy_dist_port race? 
In-Reply-To: Message of "Sat, 15 May 2010 16:01:16 CDT."
             <63794.1273957276@snookles.snookles.com> 
Message-ID: <80213.1273978319@snookles.snookles.com>

New update: recipe to duplicate.

-Scott

This recipe works for:

    Erlang/OTP R13B04 on Linux kernel 2.6.27.41-170.2.117.fc10.i686
    Erlang/OTP R13B03 on same
    Erlang/OTP R13B02 on same
    Erlang/OTP R13B01 on same
    Erlang/OTP R12B-5 on same
    Erlang/OTP R11B-5 on same

The recipe requires a bit of luck and human intervention (pressing
Control-z at the right moment).  But I can get the error to happen
within a few minute's worth of trying.

Step #1, in terminal #1:

Run the following command:

    erl -sname foo1 -kernel net_ticktime 20 -eval 'register(foo, self()), [net_adm:ping(bar1@REDACTED) || _ <- lists:seq(1,100000)], erlang:display(done).'

Step #2, in terminal #2:

Run the following command:

    erl -sname bar1 -kernel net_ticktime 20 -eval 'F = fun(Ff) -> receive X -> io:format("Got msg ~p\n", [X]), Ff(Ff) end end, spawn(fun() -> io:format("I am: ~p\n", [self()]), erlang:system_monitor(self(), [busy_port, busy_dist_port]), net_kernel:monitor_nodes(true, [{node_type, visible}, nodedown_reason]), F(F) end), L1m = lists:seq(1,1000000), [{foo, foo1@REDACTED} ! {bar, L1m} || _ <- lists:seq(1,555000)], erlang:display(done).'

As soon as you start seeting these messages in terminal #2:

    Got msg {monitor,<0.2.0>,busy_dist_port,#Port<0.98>}
    Got msg {monitor,<0.2.0>,busy_dist_port,#Port<0.98>}
    Got msg {monitor,<0.21.0>,busy_dist_port,#Port<0.98>}

... then you're ready for step #3.

NOTE: On my machine, pid <0.2.0> is the process that is executing the
code in the "-eval" flag, and pid <0.21.0> is the 'net_kernel' process.

NOTE: Using different releases of Erlang/OTP, the 'net_kernel' pid may
vary slightly, but it's in the 20's.

Step #3, in terminal #1:

When you see the {monitor,<0.21.0>,...} message in terminal #2, press
Control-Z in terminal #1.  If that message is the most recent/last
message, wait for 20 seconds or more.  You probably will not see a
'{nodedown,...}' message in terminal #2.

If you weren't fast or lucky enough, type "fg" here in terminal #1 and
then press Control-z again when you're feeling fast or lucky.  You're
sending big messages over to the terminal #1 node, so if you let this
run for too long, you'll run out of memory over there and crash.

Step #4, in terminal #3:

    erl -sname goofus1 -kernel net_ticktime 20 -remsh bar1@REDACTED

If you get an error in 10 seconds or less, congratulations!

    Erlang R13B04 (erts-5.7.5) [source] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]
    
    *** ERROR: Shell process terminated! (^G to start new job) ***

Step #5, in terminal #2:

Press Control-g, then type the following at the prompts:

    --> s
    --> c 2
    (bar1@REDACTED)1> process_info(whereis(net_kernel)).
    (bar1@REDACTED)3> io:format("~s\n", [element(2,process_info(whereis(net_kernel), backtrace))]). 

In one of my attempts, I managed to get lucky and also managed to
block the 'global_name_server' process, <0.12.0>.

    (bar1@REDACTED)2> process_info(pid(0,12,0)).        
    [{registered_name,global_name_server},
     {current_function,{erlang,bif_return_trap,1}},
     {initial_call,{proc_lib,init_p,5}},
     {status,suspended},
     {message_queue_len,0},
     {messages,[]},
     {links,[<0.13.0>,<0.14.0>,<0.15.0>,<0.10.0>]},
     {dictionary,[{{prot_vsn,foo1@REDACTED},5},
                  {{sync_tag_my,foo1@REDACTED},{1273,970221,319805}},
                  {'$ancestors',[kernel_sup,<0.9.0>]},
                  {{sync_tag_his,foo1@REDACTED},{1273,970221,329161}},
                  {'$initial_call',{global,init,1}}]},
     {trap_exit,true},
     {error_handler,error_handler},
     {priority,normal},
     {group_leader,<0.8.0>},
     {total_heap_size,987},
     {heap_size,610},
     {stack_size,24},
     {reductions,789},
     {garbage_collection,[{min_bin_vheap_size,46368},
                          {min_heap_size,233},
                          {fullsweep_after,65535},
                          {minor_gcs,5}]},
     {suspending,[]}]

    (bar1@REDACTED)4> io:format("~s\n", [element(2,process_info(pid(0,12,0), backtrace))]).        
    Program counter: 0x08243388 (unknown function)
    CP: 0xb7704738 (global:do_monitor/1 + 172)
    arity = 1
       #Ref<0.0.0.265>
    
    0xb748ab40 Return addr 0xb76fe8ec (global:insert_lock/4 + 100)
    y(0)     []
    y(1)     <4514.13.0>
    
    0xb748ab4c Return addr 0xb76fe704 (global:handle_set_lock/3 + 236)
    y(0)     []
    y(1)     []
    y(2)     {state,true,[],[],[{'foo1@REDACTED',{1273,970221,319805},<0.61.0>}],[],'nonode@REDACTED',<0.13.0>,<0.14.0>,<0.15.0>,no_trace,false}
    y(3)     [{<0.13.0>,<0.13.0>,#Ref<0.0.0.259>}]
    y(4)     <4514.13.0>
    y(5)     {global,[<0.13.0>,<4514.13.0>]}
    y(6)     [<0.13.0>,<4514.13.0>]
    y(7)     global
    
    0xb748ab70 Return addr 0xb76fa488 (global:handle_call/3 + 184)
    
    0xb748ab74 Return addr 0xb769f8e4 (gen_server:handle_msg/5 + 152)
    
    0xb748ab78 Return addr 0xb7675258 (proc_lib:init_p_do_apply/3 + 28)
    y(0)     global
    y(1)     {state,true,[],[],[{'foo1@REDACTED',{1273,970221,319805},<0.61.0>}],[],'nonode@REDACTED',<0.13.0>,<0.14.0>,<0.15.0>,no_trace,false}
    y(2)     global_name_server
    y(3)     <0.10.0>
    y(4)     {set_lock,{global,[<0.13.0>,<4514.13.0>]}}
    y(5)     {<4514.13.0>,{#Ref<4514.0.0.20846>,'bar1@REDACTED'}}
    y(6)     Catch 0xb769f8e4 (gen_server:handle_msg/5 + 152)
    
    0xb748ab98 Return addr 0x0824852c (<terminate process normally>)
    y(0)     Catch 0xb7675268 (proc_lib:init_p_do_apply/3 + 44)
    
    ok

From fritchie@REDACTED  Sun May 16 09:07:53 2010
From: fritchie@REDACTED (Scott Lystig Fritchie)
Date: Sun, 16 May 2010 02:07:53 -0500
Subject: [erlang-bugs] net_kernel hang, perhaps blocked by busy_dist_port race? 
In-Reply-To: Message of "Sat, 15 May 2010 21:51:59 CDT."
             <80213.1273978319@snookles.snookles.com> 
Message-ID: <92494.1273993673@snookles.snookles.com>

Scott Lystig Fritchie <fritchie@REDACTED> wrote:

slf> New update: recipe to duplicate.

Nothing like replying to myself again ... so, here's a kludge fix:
Allow 'max' priority processes (such as 'net_kernel') to send messages
(well, queue them really) on busy distribution ports.

--- dist.c	2009-11-20 07:29:24.000000000 -0600
+++ dist.c.slf	2010-05-16 01:23:46.000000000 -0500
@@ -1496,7 +1496,7 @@
 	dep->qsize += size_obuf(obuf);
 	if (dep->qsize >= ERTS_DE_BUSY_LIMIT)
 	    dep->qflgs |= ERTS_DE_QFLG_BUSY;
-	if (!force_busy && (dep->qflgs & ERTS_DE_QFLG_BUSY)) {
+	if (!force_busy && (dep->qflgs & ERTS_DE_QFLG_BUSY) && c_p->prio != PRIORITY_MAX) {
 	    erts_smp_spin_unlock(&dep->qlock);
 
 	    plp = erts_proclist_create(c_p);

It isn't really specific to net_kernel, but there aren't many processes
(within OTP, at least) that run at max priority and communicate with the
outside world, right?  And the worst that could happen would be to have
the port's queue get bigger past the ERTS_DE_BUSY_LIMIT before a tick
timeout closed the connection (and thus frees the port's queued data),
perhaps?

-Scott

From fritchie@REDACTED  Mon May 17 08:34:41 2010
From: fritchie@REDACTED (Scott Lystig Fritchie)
Date: Mon, 17 May 2010 01:34:41 -0500
Subject: Why would a <3K heap take 300+ milliseconds to GC?
Message-ID: <59823.1274078081@snookles.snookles.com>

Hi, sorry this is more of a something-looks-too-weird-to-be-good thing
and not an honest bug.  We've witnessed a couple of processes on an
R13B04 node that started taking over 50 milliseconds for a single GC of
a less-than-3K heap ... then stretching to over 300 milliseconds for
(what should be) the same amount of garbage.

Things get weird the longer these two procs run.  They gradually start
triggering 'long_gc' events, where the minimum threshold was 50ms.  They
reach a plateu of roughly 1150-1350 reports/day/process for a few days,
then the # of reports/day/process goes exponential: 32K reports in about
half of one day, i.e. a GC performed for nearly every timer message
received.

The worst single GC time is 349ms.

The rest of the VM was not very busy: less than 20% CPU used on average
for an 8 core box.  On May 14th, the day that the long_gc reporting
happened nearly once/sec, each for 150-270 milliseconds each, average CPU
consumption increased only very slightly.

Does this ring a bell?  It's really, really strange behavior which (we
think) culminated into some behavior unbecoming of a well-behaved
virtual machine.  If this was a legit signal that something was going
wrong, it's worth pursuing IMHO.  I can provide the code for the
gen_server if it'd be helpful.

-Scott

Distribution of long_gc times
=============================

Frequency     Range
---------     -----
     2543      50-100 milliseconds
    21003     100-199 milliseconds
    22704     200-299 milliseconds
      417     300-349 milliseconds

What the proc does
==================

The process receives a message once/second from an external timer that
tells it to net_adm:ping/1 a remote node and then rpc:call/4 to the
application controller to get the list of apps running on the remote
node.  The proc's #state record is roughly 6 fields, contains no
binaries, and only 1 field is ever updated (with a boolean) on each
iteration.  The application being monitored is never running, so the
same code path is taken (and thus (hopefully) the same amount of
garbage) each time.

Count of # of long_gc events per day, plus random sample of monitor message
===========================================================================

NOTE: There were two processes that were reporting these odd long_gc
events, <0.283.0> and <0.285.0>.  These are the counts for only one of
those two procs.

foreach i ( ?? )         ## directories per day, numbered 03-14
  echo -n $i " "
  cat $i/*/*/* | egrep long_gc | egrep '0.283.0' | wc -l
 cat $i/*/*/* | egrep long_gc | tail -1
 echo ""
 end
03  0

04  0

05  7
{monitor,<0.283.0>,long_gc,[{timeout,57},{old_heap_block_size,2584},{heap_block_size,987},{mbuf_size,0},{stack_size,17},{old_heap_size,1387},{heap_size,26}]}

06  341
{monitor,<0.285.0>,long_gc,[{timeout,52},{old_heap_block_size,2584},{heap_block_size,2584},{mbuf_size,0},{stack_size,27},{old_heap_size,79},{heap_size,32}]}

07  1195
{monitor,<0.283.0>,long_gc,[{timeout,95},{old_heap_block_size,2584},{heap_block_size,1597},{mbuf_size,0},{stack_size,17},{old_heap_size,1387},{heap_size,26}]}

08  1152
{monitor,<0.285.0>,long_gc,[{timeout,106},{old_heap_block_size,2584},{heap_block_size,2584},{mbuf_size,0},{stack_size,27},{old_heap_size,79},{heap_size,35}]}

09  1238
{monitor,<0.285.0>,long_gc,[{timeout,162},{old_heap_block_size,2584},{heap_block_size,987},{mbuf_size,0},{stack_size,32},{old_heap_size,79},{heap_size,57}]}

10  1324
{monitor,<0.283.0>,long_gc,[{timeout,121},{old_heap_block_size,2584},{heap_block_size,987},{mbuf_size,0},{stack_size,34},{old_heap_size,1387},{heap_size,36}]}

11  1332
{monitor,<0.283.0>,long_gc,[{timeout,150},{old_heap_block_size,2584},{heap_block_size,1597},{mbuf_size,0},{stack_size,32},{old_heap_size,1387},{heap_size,32}]}

12  1340
{monitor,<0.283.0>,long_gc,[{timeout,208},{old_heap_block_size,2584},{heap_block_size,377},{mbuf_size,0},{stack_size,14},{old_heap_size,1387},{heap_size,34}]}

13  6198
{monitor,<0.283.0>,long_gc,[{timeout,174},{old_heap_block_size,2584},{heap_block_size,987},{mbuf_size,0},{stack_size,25},{old_heap_size,1387},{heap_size,608}]}

14  32540
{monitor,<0.283.0>,long_gc,[{timeout,185},{old_heap_block_size,2584},{heap_block_size,987},{mbuf_size,0},{stack_size,25},{old_heap_size,1387},{heap_size,608}]}


Stack backtrace of <0.283.0>
============================

=proc:<0.283.0>
State: Scheduled
Spawned as: proc_lib:init_p/5
Spawned by: <0.92.0>
Started: Fri Mar 26 13:26:55 2010
Message queue length: 1
Message queue: [check_status]
Number of heap fragments: 0
Heap fragment data: 0
Link list: []
Dictionary:
[{'$initial_call',{brick_clientmon,init,1}},{i_am_monitoring,{'down_app@REDACTED',gdss}},{'$ancestors',[brick_mon_sup,brick_admin_sup,brick_sup,<0.88.0>]}]
Reductions: 472532540
Stack+heap: 987
OldHeap: 2584
Heap unused: 373
OldHeap unused: 2584
Stack dump:
Program counter: 0x00002aaaaab9f2c8 (gen_server:loop/6 + 288)
CP: 0x0000000000000000 (invalid)
arity = 0

0x00002aaac7d6c120 Return addr 0x00002aaaace58d70
(proc_lib:init_p_do_apply/3 + 56)
y(0)     []
y(1)     infinity
y(2)     brick_clientmon
y(3)
{state,'down_app@REDACTED',gdss,#Fun<brick_admin.39.100067762>,#Fun<brick_admin.40.108326667>,#Ref<0.0.0.2895>,false}
y(4)     <0.283.0>
y(5)     <0.92.0>

0x00002aaac7d6c158 Return addr 0x0000000000867be8 (<terminate process
normally>)
y(0)     Catch 0x00002aaaace58d90 (proc_lib:init_p_do_apply/3 + 88)


From fritchie@REDACTED  Wed May 19 19:42:56 2010
From: fritchie@REDACTED (Scott Lystig Fritchie)
Date: Wed, 19 May 2010 12:42:56 -0500
Subject: [erlang-bugs] net_kernel hang, perhaps blocked by busy_dist_port race? 
In-Reply-To: Message of "Sun, 16 May 2010 02:07:53 CDT."
             <92494.1273993673@snookles.snookles.com> 
Message-ID: <38134.1274290976@snookles.snookles.com>

{tap} {tap} Is this microphone on?  {tap}

So, another idea to avoid blocking net_kernel would be an Erlang-only
fix: all handle_call() replies would be sent by spawning a new process
that calls gen_server:reply().

However, in the recipe that I posted over the weekend, I've also managed
to block one or two of global's processes:
{registered_name,global_name_server} which is usually <0.12.0> under
R13B04 and also <0.13.0> which also appears to be related to global.

-Scott

From hans.bolinder@REDACTED  Fri May 21 08:40:04 2010
From: hans.bolinder@REDACTED (Hans Bolinder)
Date: Fri, 21 May 2010 08:40:04 +0200
Subject: [erlang-bugs] net_kernel hang, perhaps blocked by busy_dist_port race? 
In-Reply-To: <80213.1273978319@snookles.snookles.com>
References: <63794.1273957276@snookles.snookles.com>
	<80213.1273978319@snookles.snookles.com>
Message-ID: <19446.10948.910671.108710@ornendil.du.uab.ericsson.se>

[Scott Lystig Fritchie:]
> New update: recipe to duplicate.

Great work. Much appreciated!

We've been able to reproduce the scenario you describe.

Best regards,

Hans Bolinder, Erlang/OTP team, Ericsson

From pguyot@REDACTED  Fri May 21 10:37:07 2010
From: pguyot@REDACTED (Paul Guyot)
Date: Fri, 21 May 2010 10:37:07 +0200
Subject: Code path is not updated if no module is loaded in the appup
Message-ID: <2183FE2B-466E-4DA9-AF18-09F6FB0627EC@kallisys.net>

Hello,

With R13B04 (and earlier), we noticed that the code path is not updated when installing a release if no module is to be added/deleted/reloaded in the .appup file. This is a problem as the release can consist in an update to non-code files, such as MIBs, resources in priv directory, etc.

For example, if the appup is the following:

{"16",
   [{"15",
       [
       ]}
       ],
   [{"15",
       [ 
       ]}
   ]
}.

The generated relup is the following:

{"802",[{"801",[],[point_of_no_return]}],[{"801",[],[point_of_no_return]}]}.

application:which_applications returns that the application version "16" is installed and running. However, code:priv_dir and code:get_path refers to version "15" of the application.

Regards,

Paul


From fritchie@REDACTED  Fri May 21 17:44:06 2010
From: fritchie@REDACTED (Scott Lystig Fritchie)
Date: Fri, 21 May 2010 10:44:06 -0500
Subject: [erlang-bugs] net_kernel hang, perhaps blocked by busy_dist_port race? 
In-Reply-To: Message of "Fri, 21 May 2010 08:40:04 +0200."
             <19446.10948.910671.108710@ornendil.du.uab.ericsson.se> 
Message-ID: <74337.1274456646@snookles.snookles.com>

Hans Bolinder <hans.bolinder@REDACTED> wrote:

>> New update: recipe to duplicate.

hb> Great work. Much appreciated!

hb> We've been able to reproduce the scenario you describe.

Cool.

Attached is another idea for a fix.  Instead of a VM fix, it patches
net_kernel.erl to avoid direct replies by the 'net_kernel' process.
It's perhaps better by not mucking with the VM, perhaps worse because it
isn't clear if the same port blocking + process unscheduling for other
processes such as 'global_group' could cause similar problems?

-Scott
-------------- next part --------------
--- /usr/local/src/erlang/otp_src_R13B04/lib/kernel/src/net_kernel.erl.orig	2009-11-20 07:29:33.000000000 -0600
+++ ./net_kernel.erl	2010-05-20 18:21:34.000000000 -0500
@@ -354,13 +354,13 @@
 %% The response is delayed until the connection is up and
 %% running.
 %%
-handle_call({connect, _, Node}, _From, State) when Node =:= node() ->
-    {reply, true, State};
+handle_call({connect, _, Node}, From, State) when Node =:= node() ->
+    async_reply({reply, true, State}, From);
 handle_call({connect, Type, Node}, From, State) ->
     verbose({connect, Type, Node}, 1, State),
     case ets:lookup(sys_dist, Node) of
 	[Conn] when Conn#connection.state =:= up ->
-	    {reply, true, State};
+	    async_reply({reply, true, State}, From);
 	[Conn] when Conn#connection.state =:= pending ->
 	    Waiting = Conn#connection.waiting,
 	    ets:insert(sys_dist, Conn#connection{waiting = [From|Waiting]}),
@@ -376,19 +376,19 @@
 		    {noreply,State#state{conn_owners=Owners}};
 		_  ->
 		    ?connect_failure(Node, {setup_call, failed}),
-		    {reply, false, State}
+		    async_reply({reply, false, State}, From)
 	    end
     end;
 
 %%
 %% Close the connection to Node.
 %%
-handle_call({disconnect, Node}, _From, State) when Node =:= node() ->
-    {reply, false, State};
-handle_call({disconnect, Node}, _From, State) ->
+handle_call({disconnect, Node}, From, State) when Node =:= node() ->
+    async_reply({reply, false, State}, From);
+handle_call({disconnect, Node}, From, State) ->
     verbose({disconnect, Node}, 1, State),
     {Reply, State1} = do_disconnect(Node, State),
-    {reply, Reply, State1};
+    async_reply({reply, Reply, State1}, From);
 
 %% 
 %% The spawn/4 BIF ends up here.
@@ -411,39 +411,39 @@
 %% 
 %% Only allow certain nodes.
 %% 
-handle_call({allow, Nodes}, _From, State) ->
+handle_call({allow, Nodes}, From, State) ->
     case all_atoms(Nodes) of
 	true ->
 	    Allowed = State#state.allowed,
-	    {reply,ok,State#state{allowed = Allowed ++ Nodes}};  
+	    async_reply({reply,ok,State#state{allowed = Allowed ++ Nodes}}, From);  
 	false ->
-	    {reply,error,State}
+	    async_reply({reply,error,State}, From)
     end;
 
 %% 
 %% authentication, used by auth. Simply works as this:
 %% if the message comes through, the other node IS authorized.
 %% 
-handle_call({is_auth, _Node}, _From, State) ->
-    {reply,yes,State};
+handle_call({is_auth, _Node}, From, State) ->
+    async_reply({reply,yes,State}, From);
 
 %% 
 %% Not applicable any longer !?
 %% 
 handle_call({apply,_Mod,_Fun,_Args}, {From,Tag}, State) 
   when is_pid(From), node(From) =:= node() ->
-    gen_server:reply({From,Tag}, not_implemented),
+    async_gen_server_reply({From,Tag}, not_implemented),
 %    Port = State#state.port,
 %    catch apply(Mod,Fun,[Port|Args]),
     {noreply,State};
 
-handle_call(longnames, _From, State) ->
-    {reply, get(longnames), State};
+handle_call(longnames, From, State) ->
+    async_reply({reply, get(longnames), State}, From);
 
-handle_call({update_publish_nodes, Ns}, _From, State) ->
-    {reply, ok, State#state{publish_on_nodes = Ns}};
+handle_call({update_publish_nodes, Ns}, From, State) ->
+    async_reply({reply, ok, State#state{publish_on_nodes = Ns}}, From);
 
-handle_call({publish_on_node, Node}, _From, State) ->
+handle_call({publish_on_node, Node}, From, State) ->
     NewState = case State#state.publish_on_nodes of
 		   undefined ->
 		       State#state{publish_on_nodes =
@@ -457,11 +457,11 @@
 		  Nodes ->
 		      lists:member(Node, Nodes)
 	      end,
-    {reply, Publish, NewState};
+    async_reply({reply, Publish, NewState}, From);
 
 
-handle_call({verbose, Level}, _From, State) ->
-    {reply, State#state.verbose, State#state{verbose = Level}};
+handle_call({verbose, Level}, From, State) ->
+    async_reply({reply, State#state.verbose, State#state{verbose = Level}}, From);
 
 %%
 %% Set new ticktime
@@ -471,16 +471,16 @@
 %% #tick_change{} record if the ticker process has been upgraded;
 %% otherwise, an integer or an atom.
 
-handle_call(ticktime, _, #state{tick = #tick{time = T}} = State) ->
-    {reply, T, State};
-handle_call(ticktime, _, #state{tick = #tick_change{time = T}} = State) ->
-    {reply, {ongoing_change_to, T}, State};
+handle_call(ticktime, From, #state{tick = #tick{time = T}} = State) ->
+    async_reply({reply, T, State}, From);
+handle_call(ticktime, From, #state{tick = #tick_change{time = T}} = State) ->
+    async_reply({reply, {ongoing_change_to, T}, State}, From);
 
-handle_call({new_ticktime,T,_TP}, _, #state{tick = #tick{time = T}} = State) ->
+handle_call({new_ticktime,T,_TP}, From, #state{tick = #tick{time = T}} = State) ->
     ?tckr_dbg(no_tick_change),
-    {reply, unchanged, State};
+    async_reply({reply, unchanged, State}, From);
 
-handle_call({new_ticktime,T,TP}, _, #state{tick = #tick{ticker = Tckr,
+handle_call({new_ticktime,T,TP}, From, #state{tick = #tick{ticker = Tckr,
 							time = OT}} = State) ->
     ?tckr_dbg(initiating_tick_change),
     start_aux_ticker(T, OT, TP),
@@ -493,14 +493,14 @@
 		  ?tckr_dbg(shorter_ticktime),
 		  shorter
 	  end,
-    {reply, change_initiated, State#state{tick = #tick_change{ticker = Tckr,
+    async_reply({reply, change_initiated, State#state{tick = #tick_change{ticker = Tckr,
 							      time = T,
-							      how = How}}};
+							      how = How}}}, From);
 
-handle_call({new_ticktime,_,_},
+handle_call({new_ticktime,From,_},
 	    _,
 	    #state{tick = #tick_change{time = T}} = State) ->
-    {reply, {ongoing_change_to, T}, State}.
+    async_reply({reply, {ongoing_change_to, T}, State}, From).
 
 %% ------------------------------------------------------------
 %% handle_cast.
@@ -1079,11 +1079,11 @@
 
 spawn_func(link,{From,Tag},M,F,A,Gleader) ->
     link(From),
-    gen_server:reply({From,Tag},self()),  %% ahhh
+    async_gen_server_reply({From,Tag},self()),  %% ahhh
     group_leader(Gleader,self()),
     apply(M,F,A);
 spawn_func(_,{From,Tag},M,F,A,Gleader) ->
-    gen_server:reply({From,Tag},self()),  %% ahhh
+    async_gen_server_reply({From,Tag},self()),  %% ahhh
     group_leader(Gleader,self()),
     apply(M,F,A).
 
@@ -1409,7 +1409,7 @@
     reply_waiting1(lists:reverse(Waiting), Rep).
 
 reply_waiting1([From|W], Rep) ->
-    gen_server:reply(From, Rep),
+    async_gen_server_reply(From, Rep),
     reply_waiting1(W, Rep);
 reply_waiting1([], _) ->
     ok.
@@ -1511,3 +1511,10 @@
 
 getnode(P) when is_pid(P) -> node(P);
 getnode(P) -> P.
+
+async_reply({reply, Msg, State}, From) ->
+    async_gen_server_reply(From, Msg),
+    {noreply, State}.
+
+async_gen_server_reply(From, Msg) ->
+    spawn(fun() -> gen_server:reply(From, Msg) end).

From bob@REDACTED  Wed May 26 02:00:39 2010
From: bob@REDACTED (Bob Ippolito)
Date: Tue, 25 May 2010 17:00:39 -0700
Subject: R13B04 inet_res:resolve/4 inet_udp Port leak
Message-ID: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com>

It appears that there may be an inet_udp Port leak in
inet_res:resolve/4, our current workaround is to spawn a new process
to call this function. We've noticed this primarily for a service that
regularly does a UDP DNS query that fails (because the response is too
big) and then we retry over TCP.

This is what the state of the process looked like when it was leaking ports:

(node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(whereis(dns_gen_server), links)))).
577
(node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(whereis(dns_gen_server), links)))]).
[{name,"udp_inet"}]

The code looked like this, before the workaround was implemented:

%% @spec dns(string()) -> [string()]
%% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
%%     This may return an empty list if there no A records for this Host name.
dns(Host) when is_list(Host) ->
    dns(Host, fun inet_res:resolve/4).

dns(Host, ResolveFun) ->
    case ResolveFun(Host, in, a, []) of
        {ok, Msg} ->
            ips_for_answers(Msg);
        {error, {nxdomain, _}} ->
            [];
        {error, timeout} ->
            %% retry with TCP
            case ResolveFun(Host, in, a, [{usevc, true}]) of
                {ok, Msg} ->
                    ips_for_answers(Msg);
                {error, {nxdomain, _}} ->
                    [];
                Error = {error, _} ->
                    Error
            end;
        Error = {error, _} ->
            Error
    end.

ips_for_answers(Msg) ->
    [inet_parse:ntoa(inet_dns:rr(Answer, data))
     || Answer <- inet_dns:msg(Msg, anlist)].

The workaround we used was to call it indirectly with this function, I
couldn't find anything in OTP that did the same thing that didn't have
local call optimizations.

%% @spec process_apply(atom(), atom(), [term()]) -> term()
%% @doc erlang:apply(M, F, A) in a temporary process and return the results.
process_apply(M,F,A) ->
    %% We can't just use rpc here because there's a local call optimization.
    Parent = self(),
    Fun = fun () ->
                  try
                      Parent ! {self(), erlang:apply(M, F, A)}
                  catch
                      Class:Reason ->
                          Stacktrace = erlang:get_stacktrace(),
                          Parent ! {self(), Class, Reason, Stacktrace}
                  end
          end,
    {Pid, Ref} = erlang:spawn_monitor(Fun),
    receive
        {Pid, Res} ->
            receive {'DOWN', Ref, process, Pid, _} -> ok end,
            Res;
        {Pid, Class, Reason, Stacktrace} ->
            receive {'DOWN', Ref, process, Pid, _} -> ok end,
            erlang:error(erlang:raise(Class, Reason, Stacktrace));
        {'DOWN', Ref, process, Pid, Reason} ->
            erlang:exit(Reason)
    end.

From zl9d97p02@REDACTED  Wed May 26 02:56:30 2010
From: zl9d97p02@REDACTED (Simon Cornish)
Date: Tue, 25 May 2010 17:56:30 -0700
Subject: R12 emulator crashes with zero-length port_control binary
In-Reply-To: <AANLkTilKYAMcyYkz_5X3NCtLkLseFceZ-x_WiPhBA1xk@mail.gmail.com>
References: <AANLkTilKYAMcyYkz_5X3NCtLkLseFceZ-x_WiPhBA1xk@mail.gmail.com>
Message-ID: <25026-1274835390-951678@sneakemail.com>

If a linked-in driver returns 0 to a port_control call and
PORT_CONTROL_FLAG_BINARY is set then the beam emulator will probably
crash or otherwise misbehave.

Attached is a patch for those who are stuck on R12 and might get
bitten by this. Tested on R12B-3, applies also to R12B-5.

It's already fixed (in a different way) in R13+

/Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: io.c.patch
Type: application/octet-stream
Size: 572 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20100525/aea2ad6f/attachment.obj>

From raimo+erlang-bugs@REDACTED  Wed May 26 11:06:26 2010
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Wed, 26 May 2010 11:06:26 +0200
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com>
Message-ID: <20100526090626.GA17931@erix.ericsson.se>

By reading the code it seems there is a bug when all nameservers
return an answer that causes decode errors, or can not be
contacted (enetunreach or econnrefused); then an
UDP port (or maybe two; one inet and one inet6) is leaked
since the inet_res:udp_close/1 is not called.

This should be fixed with:

diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
index 9b9e078..3d38a01 100644
--- a/lib/kernel/src/inet_res.erl
+++ b/lib/kernel/src/inet_res.erl
@@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
 query_retries(Q, NSs, Timer, Retry, I, S0) ->
     Num = length(NSs),
     if Num =:= 0 ->
+           udp_close(S),
            {error,timeout};
        true ->
            case query_nss(Q, NSs, Timer, Retry, I, S0, []) of

This "retry with TCP" trick of yours should really not be necessary
since inet_res retries with TCP if it gets a truncated UDP answer.
Have you got some other case when retrying with TCP is essential?

Or, does your DNS server produce a (valid?) result that
triggers a debug bug in inet_res, causing the decode error,
triggering the port leak bug, forcing you to retry with TCP?

On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> It appears that there may be an inet_udp Port leak in
> inet_res:resolve/4, our current workaround is to spawn a new process
> to call this function. We've noticed this primarily for a service that
> regularly does a UDP DNS query that fails (because the response is too
> big) and then we retry over TCP.
> 
> This is what the state of the process looked like when it was leaking ports:
> 
> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(whereis(dns_gen_server), links)))).
> 577
> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
> lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(whereis(dns_gen_server), links)))]).
> [{name,"udp_inet"}]
> 
> The code looked like this, before the workaround was implemented:
> 
> %% @spec dns(string()) -> [string()]
> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> %%     This may return an empty list if there no A records for this Host name.
> dns(Host) when is_list(Host) ->
>     dns(Host, fun inet_res:resolve/4).
> 
> dns(Host, ResolveFun) ->
>     case ResolveFun(Host, in, a, []) of
>         {ok, Msg} ->
>             ips_for_answers(Msg);
>         {error, {nxdomain, _}} ->
>             [];
>         {error, timeout} ->
>             %% retry with TCP
>             case ResolveFun(Host, in, a, [{usevc, true}]) of
>                 {ok, Msg} ->
>                     ips_for_answers(Msg);
>                 {error, {nxdomain, _}} ->
>                     [];
>                 Error = {error, _} ->
>                     Error
>             end;
>         Error = {error, _} ->
>             Error
>     end.
> 
> ips_for_answers(Msg) ->
>     [inet_parse:ntoa(inet_dns:rr(Answer, data))
>      || Answer <- inet_dns:msg(Msg, anlist)].
> 
> The workaround we used was to call it indirectly with this function, I
> couldn't find anything in OTP that did the same thing that didn't have
> local call optimizations.
> 
> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> process_apply(M,F,A) ->
>     %% We can't just use rpc here because there's a local call optimization.
>     Parent = self(),
>     Fun = fun () ->
>                   try
>                       Parent ! {self(), erlang:apply(M, F, A)}
>                   catch
>                       Class:Reason ->
>                           Stacktrace = erlang:get_stacktrace(),
>                           Parent ! {self(), Class, Reason, Stacktrace}
>                   end
>           end,
>     {Pid, Ref} = erlang:spawn_monitor(Fun),
>     receive
>         {Pid, Res} ->
>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>             Res;
>         {Pid, Class, Reason, Stacktrace} ->
>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>             erlang:error(erlang:raise(Class, Reason, Stacktrace));
>         {'DOWN', Ref, process, Pid, Reason} ->
>             erlang:exit(Reason)
>     end.
> 
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From bob@REDACTED  Wed May 26 17:02:48 2010
From: bob@REDACTED (Bob Ippolito)
Date: Wed, 26 May 2010 08:02:48 -0700
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <20100526090626.GA17931@erix.ericsson.se>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com>
	 <20100526090626.GA17931@erix.ericsson.se>
Message-ID: <AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com>

Well, I'm not sure exactly which scenario is happening because I
haven't looked at the packets yet, but the manual TCP retry is
required.

mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
[async-threads:4] [hipe] [kernel-poll:true]

Eshell V5.7.5  (abort with ^G)
1> lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(self(), links))).
[]
2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
{error,timeout}
3> lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(self(), links))).
[#Port<0.514>]
4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
ok


On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
<raimo+erlang-bugs@REDACTED> wrote:
> By reading the code it seems there is a bug when all nameservers
> return an answer that causes decode errors, or can not be
> contacted (enetunreach or econnrefused); then an
> UDP port (or maybe two; one inet and one inet6) is leaked
> since the inet_res:udp_close/1 is not called.
>
> This should be fixed with:
>
> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> index 9b9e078..3d38a01 100644
> --- a/lib/kernel/src/inet_res.erl
> +++ b/lib/kernel/src/inet_res.erl
> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
> ?query_retries(Q, NSs, Timer, Retry, I, S0) ->
> ? ? Num = length(NSs),
> ? ? if Num =:= 0 ->
> + ? ? ? ? ? udp_close(S),
> ? ? ? ? ? ?{error,timeout};
> ? ? ? ?true ->
> ? ? ? ? ? ?case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
>
> This "retry with TCP" trick of yours should really not be necessary
> since inet_res retries with TCP if it gets a truncated UDP answer.
> Have you got some other case when retrying with TCP is essential?
>
> Or, does your DNS server produce a (valid?) result that
> triggers a debug bug in inet_res, causing the decode error,
> triggering the port leak bug, forcing you to retry with TCP?
>
> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
>> It appears that there may be an inet_udp Port leak in
>> inet_res:resolve/4, our current workaround is to spawn a new process
>> to call this function. We've noticed this primarily for a service that
>> regularly does a UDP DNS query that fails (because the response is too
>> big) and then we retry over TCP.
>>
>> This is what the state of the process looked like when it was leaking ports:
>>
>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
>> erlang:process_info(whereis(dns_gen_server), links)))).
>> 577
>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
>> lists:filter(fun erlang:is_port/1, element(2,
>> erlang:process_info(whereis(dns_gen_server), links)))]).
>> [{name,"udp_inet"}]
>>
>> The code looked like this, before the workaround was implemented:
>>
>> %% @spec dns(string()) -> [string()]
>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
>> %% ? ? This may return an empty list if there no A records for this Host name.
>> dns(Host) when is_list(Host) ->
>> ? ? dns(Host, fun inet_res:resolve/4).
>>
>> dns(Host, ResolveFun) ->
>> ? ? case ResolveFun(Host, in, a, []) of
>> ? ? ? ? {ok, Msg} ->
>> ? ? ? ? ? ? ips_for_answers(Msg);
>> ? ? ? ? {error, {nxdomain, _}} ->
>> ? ? ? ? ? ? [];
>> ? ? ? ? {error, timeout} ->
>> ? ? ? ? ? ? %% retry with TCP
>> ? ? ? ? ? ? case ResolveFun(Host, in, a, [{usevc, true}]) of
>> ? ? ? ? ? ? ? ? {ok, Msg} ->
>> ? ? ? ? ? ? ? ? ? ? ips_for_answers(Msg);
>> ? ? ? ? ? ? ? ? {error, {nxdomain, _}} ->
>> ? ? ? ? ? ? ? ? ? ? [];
>> ? ? ? ? ? ? ? ? Error = {error, _} ->
>> ? ? ? ? ? ? ? ? ? ? Error
>> ? ? ? ? ? ? end;
>> ? ? ? ? Error = {error, _} ->
>> ? ? ? ? ? ? Error
>> ? ? end.
>>
>> ips_for_answers(Msg) ->
>> ? ? [inet_parse:ntoa(inet_dns:rr(Answer, data))
>> ? ? ?|| Answer <- inet_dns:msg(Msg, anlist)].
>>
>> The workaround we used was to call it indirectly with this function, I
>> couldn't find anything in OTP that did the same thing that didn't have
>> local call optimizations.
>>
>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
>> process_apply(M,F,A) ->
>> ? ? %% We can't just use rpc here because there's a local call optimization.
>> ? ? Parent = self(),
>> ? ? Fun = fun () ->
>> ? ? ? ? ? ? ? ? ? try
>> ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), erlang:apply(M, F, A)}
>> ? ? ? ? ? ? ? ? ? catch
>> ? ? ? ? ? ? ? ? ? ? ? Class:Reason ->
>> ? ? ? ? ? ? ? ? ? ? ? ? ? Stacktrace = erlang:get_stacktrace(),
>> ? ? ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), Class, Reason, Stacktrace}
>> ? ? ? ? ? ? ? ? ? end
>> ? ? ? ? ? end,
>> ? ? {Pid, Ref} = erlang:spawn_monitor(Fun),
>> ? ? receive
>> ? ? ? ? {Pid, Res} ->
>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
>> ? ? ? ? ? ? Res;
>> ? ? ? ? {Pid, Class, Reason, Stacktrace} ->
>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
>> ? ? ? ? ? ? erlang:error(erlang:raise(Class, Reason, Stacktrace));
>> ? ? ? ? {'DOWN', Ref, process, Pid, Reason} ->
>> ? ? ? ? ? ? erlang:exit(Reason)
>> ? ? end.
>>
>> ________________________________________________________________
>> erlang-bugs (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>

From bob@REDACTED  Wed May 26 20:59:26 2010
From: bob@REDACTED (Bob Ippolito)
Date: Wed, 26 May 2010 11:59:26 -0700
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com>
	 <20100526090626.GA17931@erix.ericsson.se>
	 <AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com>
Message-ID: <AANLkTinr4UpZ2Zpb4XMe90Z0hnkYKcuRL_x3m-i1THqy@mail.gmail.com>

Here's the DNS packet that is being received as a response to the query:

1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
101,116,0,0,1,0,1>>).
{error,fmt}

On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> Well, I'm not sure exactly which scenario is happening because I
> haven't looked at the packets yet, but the manual TCP retry is
> required.
>
> mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
> Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> [async-threads:4] [hipe] [kernel-poll:true]
>
> Eshell V5.7.5 ?(abort with ^G)
> 1> lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(self(), links))).
> []
> 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> {error,timeout}
> 3> lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(self(), links))).
> [#Port<0.514>]
> 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> ok
>
>
> On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> <raimo+erlang-bugs@REDACTED> wrote:
>> By reading the code it seems there is a bug when all nameservers
>> return an answer that causes decode errors, or can not be
>> contacted (enetunreach or econnrefused); then an
>> UDP port (or maybe two; one inet and one inet6) is leaked
>> since the inet_res:udp_close/1 is not called.
>>
>> This should be fixed with:
>>
>> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
>> index 9b9e078..3d38a01 100644
>> --- a/lib/kernel/src/inet_res.erl
>> +++ b/lib/kernel/src/inet_res.erl
>> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
>> ?query_retries(Q, NSs, Timer, Retry, I, S0) ->
>> ? ? Num = length(NSs),
>> ? ? if Num =:= 0 ->
>> + ? ? ? ? ? udp_close(S),
>> ? ? ? ? ? ?{error,timeout};
>> ? ? ? ?true ->
>> ? ? ? ? ? ?case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
>>
>> This "retry with TCP" trick of yours should really not be necessary
>> since inet_res retries with TCP if it gets a truncated UDP answer.
>> Have you got some other case when retrying with TCP is essential?
>>
>> Or, does your DNS server produce a (valid?) result that
>> triggers a debug bug in inet_res, causing the decode error,
>> triggering the port leak bug, forcing you to retry with TCP?
>>
>> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
>>> It appears that there may be an inet_udp Port leak in
>>> inet_res:resolve/4, our current workaround is to spawn a new process
>>> to call this function. We've noticed this primarily for a service that
>>> regularly does a UDP DNS query that fails (because the response is too
>>> big) and then we retry over TCP.
>>>
>>> This is what the state of the process looked like when it was leaking ports:
>>>
>>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
>>> erlang:process_info(whereis(dns_gen_server), links)))).
>>> 577
>>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
>>> lists:filter(fun erlang:is_port/1, element(2,
>>> erlang:process_info(whereis(dns_gen_server), links)))]).
>>> [{name,"udp_inet"}]
>>>
>>> The code looked like this, before the workaround was implemented:
>>>
>>> %% @spec dns(string()) -> [string()]
>>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
>>> %% ? ? This may return an empty list if there no A records for this Host name.
>>> dns(Host) when is_list(Host) ->
>>> ? ? dns(Host, fun inet_res:resolve/4).
>>>
>>> dns(Host, ResolveFun) ->
>>> ? ? case ResolveFun(Host, in, a, []) of
>>> ? ? ? ? {ok, Msg} ->
>>> ? ? ? ? ? ? ips_for_answers(Msg);
>>> ? ? ? ? {error, {nxdomain, _}} ->
>>> ? ? ? ? ? ? [];
>>> ? ? ? ? {error, timeout} ->
>>> ? ? ? ? ? ? %% retry with TCP
>>> ? ? ? ? ? ? case ResolveFun(Host, in, a, [{usevc, true}]) of
>>> ? ? ? ? ? ? ? ? {ok, Msg} ->
>>> ? ? ? ? ? ? ? ? ? ? ips_for_answers(Msg);
>>> ? ? ? ? ? ? ? ? {error, {nxdomain, _}} ->
>>> ? ? ? ? ? ? ? ? ? ? [];
>>> ? ? ? ? ? ? ? ? Error = {error, _} ->
>>> ? ? ? ? ? ? ? ? ? ? Error
>>> ? ? ? ? ? ? end;
>>> ? ? ? ? Error = {error, _} ->
>>> ? ? ? ? ? ? Error
>>> ? ? end.
>>>
>>> ips_for_answers(Msg) ->
>>> ? ? [inet_parse:ntoa(inet_dns:rr(Answer, data))
>>> ? ? ?|| Answer <- inet_dns:msg(Msg, anlist)].
>>>
>>> The workaround we used was to call it indirectly with this function, I
>>> couldn't find anything in OTP that did the same thing that didn't have
>>> local call optimizations.
>>>
>>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
>>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
>>> process_apply(M,F,A) ->
>>> ? ? %% We can't just use rpc here because there's a local call optimization.
>>> ? ? Parent = self(),
>>> ? ? Fun = fun () ->
>>> ? ? ? ? ? ? ? ? ? try
>>> ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), erlang:apply(M, F, A)}
>>> ? ? ? ? ? ? ? ? ? catch
>>> ? ? ? ? ? ? ? ? ? ? ? Class:Reason ->
>>> ? ? ? ? ? ? ? ? ? ? ? ? ? Stacktrace = erlang:get_stacktrace(),
>>> ? ? ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), Class, Reason, Stacktrace}
>>> ? ? ? ? ? ? ? ? ? end
>>> ? ? ? ? ? end,
>>> ? ? {Pid, Ref} = erlang:spawn_monitor(Fun),
>>> ? ? receive
>>> ? ? ? ? {Pid, Res} ->
>>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
>>> ? ? ? ? ? ? Res;
>>> ? ? ? ? {Pid, Class, Reason, Stacktrace} ->
>>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
>>> ? ? ? ? ? ? erlang:error(erlang:raise(Class, Reason, Stacktrace));
>>> ? ? ? ? {'DOWN', Ref, process, Pid, Reason} ->
>>> ? ? ? ? ? ? erlang:exit(Reason)
>>> ? ? end.
>>>
>>> ________________________________________________________________
>>> erlang-bugs (at) erlang.org mailing list.
>>> See http://www.erlang.org/faq.html
>>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>>
>> --
>>
>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>
>

From zhangjr2009@REDACTED  Wed May 26 22:36:45 2010
From: zhangjr2009@REDACTED (JR Zhang)
Date: Wed, 26 May 2010 22:36:45 +0200
Subject: Problem with function ethr_rwmutex_tryrlock
Message-ID: <AANLkTil21ls-4auKnPXPIeLt9mb_RRs9n8NGXRrqS4d5@mail.gmail.com>

Hi list,

I think the fallback version of function ethr_rwmutex_tryrlock in
erts/lib_src/common/ethread.c is not correct. This function should be
similar with pthread_rwlock_tryrdlock. For pthread_rwlock_tryrdlock, the
calling thread acquires the read lock if a writer does not hold the lock and
there are no writers blocked on the lock. But as following code shows,
ethr_rwmutex_tryrlock doesn't get the lock when there is no waiting writer,
and acquires the lock when there are waiting writers. Am I right?

 ethr_rwmutex_tryrlock(ethr_rwmutex *rwmtx)
{
    int res;
#if ETHR_XCHK
    if (!rwmtx || rwmtx->initialized != ETHR_RWMUTEX_INITIALIZED) {
ASSERT(0);
return EINVAL;
    }
#endif
    res = ethr_mutex_trylock__(&rwmtx->mtx);
    if (res != 0)
return res;
    if (!rwmtx->waiting_writers) {
res = ethr_mutex_unlock__(&rwmtx->mtx);
if (res == 0)
    return EBUSY;
return res;
    }
    rwmtx->readers++;
    return ethr_mutex_unlock__(&rwmtx->mtx);
}

Best Regards,
Jianrong Zhang

From raimo+erlang-bugs@REDACTED  Thu May 27 09:43:54 2010
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Thu, 27 May 2010 09:43:54 +0200
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <AANLkTinr4UpZ2Zpb4XMe90Z0hnkYKcuRL_x3m-i1THqy@mail.gmail.com>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com> <20100526090626.GA17931@erix.ericsson.se> <AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com> <AANLkTinr4UpZ2Zpb4XMe90Z0hnkYKcuRL_x3m-i1THqy@mail.gmail.com>
Message-ID: <20100527074354.GA5584@erix.ericsson.se>

On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> Here's the DNS packet that is being received as a response to the query:
> 
> 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> 101,116,0,0,1,0,1>>).
> {error,fmt}

Thank you very much! I was about to give you detailed instructions
about how to dig that up :-)

I have spotted the problem.

The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
60 answer records, but actually contains zero. inet_dns expects to find
60 answer records if it says so. This is a hazy part of the
DNS specifications and the resolver I tested truncation on did
not do this kind of self-contradiction, but it _may_ be allowed
by the specification...

I regard it as a bug (or at least need-to-fix-problem) in inet_dns
since it should be real-world compatible not just specification compatible.
It should allow record shortage in a section if the TC bit is set.
I'll try to fix it in R14A.

Can you try my patch adding a missing udp_close(S) to see
if it stops the leaking port problem? That is a more serious bug.

> 
> On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> > Well, I'm not sure exactly which scenario is happening because I
> > haven't looked at the packets yet, but the manual TCP retry is
> > required.
> >
> > mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
> > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> > [async-threads:4] [hipe] [kernel-poll:true]
> >
> > Eshell V5.7.5 ?(abort with ^G)
> > 1> lists:filter(fun erlang:is_port/1, element(2,
> > erlang:process_info(self(), links))).
> > []
> > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> > {error,timeout}
> > 3> lists:filter(fun erlang:is_port/1, element(2,
> > erlang:process_info(self(), links))).
> > [#Port<0.514>]
> > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> > ok
> >
> >
> > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> > <raimo+erlang-bugs@REDACTED> wrote:
> >> By reading the code it seems there is a bug when all nameservers
> >> return an answer that causes decode errors, or can not be
> >> contacted (enetunreach or econnrefused); then an
> >> UDP port (or maybe two; one inet and one inet6) is leaked
> >> since the inet_res:udp_close/1 is not called.
> >>
> >> This should be fixed with:
> >>
> >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> >> index 9b9e078..3d38a01 100644
> >> --- a/lib/kernel/src/inet_res.erl
> >> +++ b/lib/kernel/src/inet_res.erl
> >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
> >> ?query_retries(Q, NSs, Timer, Retry, I, S0) ->
> >> ? ? Num = length(NSs),
> >> ? ? if Num =:= 0 ->
> >> + ? ? ? ? ? udp_close(S),
> >> ? ? ? ? ? ?{error,timeout};
> >> ? ? ? ?true ->
> >> ? ? ? ? ? ?case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
> >>
> >> This "retry with TCP" trick of yours should really not be necessary
> >> since inet_res retries with TCP if it gets a truncated UDP answer.
> >> Have you got some other case when retrying with TCP is essential?
> >>
> >> Or, does your DNS server produce a (valid?) result that
> >> triggers a debug bug in inet_res, causing the decode error,
> >> triggering the port leak bug, forcing you to retry with TCP?
> >>
> >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> >>> It appears that there may be an inet_udp Port leak in
> >>> inet_res:resolve/4, our current workaround is to spawn a new process
> >>> to call this function. We've noticed this primarily for a service that
> >>> regularly does a UDP DNS query that fails (because the response is too
> >>> big) and then we retry over TCP.
> >>>
> >>> This is what the state of the process looked like when it was leaking ports:
> >>>
> >>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
> >>> erlang:process_info(whereis(dns_gen_server), links)))).
> >>> 577
> >>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
> >>> lists:filter(fun erlang:is_port/1, element(2,
> >>> erlang:process_info(whereis(dns_gen_server), links)))]).
> >>> [{name,"udp_inet"}]
> >>>
> >>> The code looked like this, before the workaround was implemented:
> >>>
> >>> %% @spec dns(string()) -> [string()]
> >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> >>> %% ? ? This may return an empty list if there no A records for this Host name.
> >>> dns(Host) when is_list(Host) ->
> >>> ? ? dns(Host, fun inet_res:resolve/4).
> >>>
> >>> dns(Host, ResolveFun) ->
> >>> ? ? case ResolveFun(Host, in, a, []) of
> >>> ? ? ? ? {ok, Msg} ->
> >>> ? ? ? ? ? ? ips_for_answers(Msg);
> >>> ? ? ? ? {error, {nxdomain, _}} ->
> >>> ? ? ? ? ? ? [];
> >>> ? ? ? ? {error, timeout} ->
> >>> ? ? ? ? ? ? %% retry with TCP
> >>> ? ? ? ? ? ? case ResolveFun(Host, in, a, [{usevc, true}]) of
> >>> ? ? ? ? ? ? ? ? {ok, Msg} ->
> >>> ? ? ? ? ? ? ? ? ? ? ips_for_answers(Msg);
> >>> ? ? ? ? ? ? ? ? {error, {nxdomain, _}} ->
> >>> ? ? ? ? ? ? ? ? ? ? [];
> >>> ? ? ? ? ? ? ? ? Error = {error, _} ->
> >>> ? ? ? ? ? ? ? ? ? ? Error
> >>> ? ? ? ? ? ? end;
> >>> ? ? ? ? Error = {error, _} ->
> >>> ? ? ? ? ? ? Error
> >>> ? ? end.
> >>>
> >>> ips_for_answers(Msg) ->
> >>> ? ? [inet_parse:ntoa(inet_dns:rr(Answer, data))
> >>> ? ? ?|| Answer <- inet_dns:msg(Msg, anlist)].
> >>>
> >>> The workaround we used was to call it indirectly with this function, I
> >>> couldn't find anything in OTP that did the same thing that didn't have
> >>> local call optimizations.
> >>>
> >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> >>> process_apply(M,F,A) ->
> >>> ? ? %% We can't just use rpc here because there's a local call optimization.
> >>> ? ? Parent = self(),
> >>> ? ? Fun = fun () ->
> >>> ? ? ? ? ? ? ? ? ? try
> >>> ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), erlang:apply(M, F, A)}
> >>> ? ? ? ? ? ? ? ? ? catch
> >>> ? ? ? ? ? ? ? ? ? ? ? Class:Reason ->
> >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Stacktrace = erlang:get_stacktrace(),
> >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), Class, Reason, Stacktrace}
> >>> ? ? ? ? ? ? ? ? ? end
> >>> ? ? ? ? ? end,
> >>> ? ? {Pid, Ref} = erlang:spawn_monitor(Fun),
> >>> ? ? receive
> >>> ? ? ? ? {Pid, Res} ->
> >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> >>> ? ? ? ? ? ? Res;
> >>> ? ? ? ? {Pid, Class, Reason, Stacktrace} ->
> >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> >>> ? ? ? ? ? ? erlang:error(erlang:raise(Class, Reason, Stacktrace));
> >>> ? ? ? ? {'DOWN', Ref, process, Pid, Reason} ->
> >>> ? ? ? ? ? ? erlang:exit(Reason)
> >>> ? ? end.
> >>>
> >>> ________________________________________________________________
> >>> erlang-bugs (at) erlang.org mailing list.
> >>> See http://www.erlang.org/faq.html
> >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >>
> >> --
> >>
> >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >>
> >
> 
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> 

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From raimo+erlang-bugs@REDACTED  Thu May 27 12:02:48 2010
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Thu, 27 May 2010 12:02:48 +0200
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <20100527074354.GA5584@erix.ericsson.se>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com> <20100526090626.GA17931@erix.ericsson.se> <AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com> <AANLkTinr4UpZ2Zpb4XMe90Z0hnkYKcuRL_x3m-i1THqy@mail.gmail.com> <20100527074354.GA5584@erix.ericsson.se>
Message-ID: <20100527100248.GA3917@erix.ericsson.se>

I have created a fix for these problems:
    git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports

It will be included in 'pu'.

Unfortunately, the second commit eliminates the bug trigger
for what the first commit fixes. So to test if the bug fix
is fixing the bug, one should apply the first commit only.

On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> > Here's the DNS packet that is being received as a response to the query:
> > 
> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> > 101,116,0,0,1,0,1>>).
> > {error,fmt}
> 
> Thank you very much! I was about to give you detailed instructions
> about how to dig that up :-)
> 
> I have spotted the problem.
> 
> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
> 60 answer records, but actually contains zero. inet_dns expects to find
> 60 answer records if it says so. This is a hazy part of the
> DNS specifications and the resolver I tested truncation on did
> not do this kind of self-contradiction, but it _may_ be allowed
> by the specification...
> 
> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
> since it should be real-world compatible not just specification compatible.
> It should allow record shortage in a section if the TC bit is set.
> I'll try to fix it in R14A.
> 
> Can you try my patch adding a missing udp_close(S) to see
> if it stops the leaking port problem? That is a more serious bug.
> 
> > 
> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> > > Well, I'm not sure exactly which scenario is happening because I
> > > haven't looked at the packets yet, but the manual TCP retry is
> > > required.
> > >
> > > mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> > > [async-threads:4] [hipe] [kernel-poll:true]
> > >
> > > Eshell V5.7.5 ?(abort with ^G)
> > > 1> lists:filter(fun erlang:is_port/1, element(2,
> > > erlang:process_info(self(), links))).
> > > []
> > > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> > > {error,timeout}
> > > 3> lists:filter(fun erlang:is_port/1, element(2,
> > > erlang:process_info(self(), links))).
> > > [#Port<0.514>]
> > > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> > > ok
> > >
> > >
> > > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> > > <raimo+erlang-bugs@REDACTED> wrote:
> > >> By reading the code it seems there is a bug when all nameservers
> > >> return an answer that causes decode errors, or can not be
> > >> contacted (enetunreach or econnrefused); then an
> > >> UDP port (or maybe two; one inet and one inet6) is leaked
> > >> since the inet_res:udp_close/1 is not called.
> > >>
> > >> This should be fixed with:
> > >>
> > >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> > >> index 9b9e078..3d38a01 100644
> > >> --- a/lib/kernel/src/inet_res.erl
> > >> +++ b/lib/kernel/src/inet_res.erl
> > >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
> > >> ?query_retries(Q, NSs, Timer, Retry, I, S0) ->
> > >> ? ? Num = length(NSs),
> > >> ? ? if Num =:= 0 ->
> > >> + ? ? ? ? ? udp_close(S),
> > >> ? ? ? ? ? ?{error,timeout};
> > >> ? ? ? ?true ->
> > >> ? ? ? ? ? ?case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
> > >>
> > >> This "retry with TCP" trick of yours should really not be necessary
> > >> since inet_res retries with TCP if it gets a truncated UDP answer.
> > >> Have you got some other case when retrying with TCP is essential?
> > >>
> > >> Or, does your DNS server produce a (valid?) result that
> > >> triggers a debug bug in inet_res, causing the decode error,
> > >> triggering the port leak bug, forcing you to retry with TCP?
> > >>
> > >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> > >>> It appears that there may be an inet_udp Port leak in
> > >>> inet_res:resolve/4, our current workaround is to spawn a new process
> > >>> to call this function. We've noticed this primarily for a service that
> > >>> regularly does a UDP DNS query that fails (because the response is too
> > >>> big) and then we retry over TCP.
> > >>>
> > >>> This is what the state of the process looked like when it was leaking ports:
> > >>>
> > >>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
> > >>> erlang:process_info(whereis(dns_gen_server), links)))).
> > >>> 577
> > >>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
> > >>> lists:filter(fun erlang:is_port/1, element(2,
> > >>> erlang:process_info(whereis(dns_gen_server), links)))]).
> > >>> [{name,"udp_inet"}]
> > >>>
> > >>> The code looked like this, before the workaround was implemented:
> > >>>
> > >>> %% @spec dns(string()) -> [string()]
> > >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> > >>> %% ? ? This may return an empty list if there no A records for this Host name.
> > >>> dns(Host) when is_list(Host) ->
> > >>> ? ? dns(Host, fun inet_res:resolve/4).
> > >>>
> > >>> dns(Host, ResolveFun) ->
> > >>> ? ? case ResolveFun(Host, in, a, []) of
> > >>> ? ? ? ? {ok, Msg} ->
> > >>> ? ? ? ? ? ? ips_for_answers(Msg);
> > >>> ? ? ? ? {error, {nxdomain, _}} ->
> > >>> ? ? ? ? ? ? [];
> > >>> ? ? ? ? {error, timeout} ->
> > >>> ? ? ? ? ? ? %% retry with TCP
> > >>> ? ? ? ? ? ? case ResolveFun(Host, in, a, [{usevc, true}]) of
> > >>> ? ? ? ? ? ? ? ? {ok, Msg} ->
> > >>> ? ? ? ? ? ? ? ? ? ? ips_for_answers(Msg);
> > >>> ? ? ? ? ? ? ? ? {error, {nxdomain, _}} ->
> > >>> ? ? ? ? ? ? ? ? ? ? [];
> > >>> ? ? ? ? ? ? ? ? Error = {error, _} ->
> > >>> ? ? ? ? ? ? ? ? ? ? Error
> > >>> ? ? ? ? ? ? end;
> > >>> ? ? ? ? Error = {error, _} ->
> > >>> ? ? ? ? ? ? Error
> > >>> ? ? end.
> > >>>
> > >>> ips_for_answers(Msg) ->
> > >>> ? ? [inet_parse:ntoa(inet_dns:rr(Answer, data))
> > >>> ? ? ?|| Answer <- inet_dns:msg(Msg, anlist)].
> > >>>
> > >>> The workaround we used was to call it indirectly with this function, I
> > >>> couldn't find anything in OTP that did the same thing that didn't have
> > >>> local call optimizations.
> > >>>
> > >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> > >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> > >>> process_apply(M,F,A) ->
> > >>> ? ? %% We can't just use rpc here because there's a local call optimization.
> > >>> ? ? Parent = self(),
> > >>> ? ? Fun = fun () ->
> > >>> ? ? ? ? ? ? ? ? ? try
> > >>> ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), erlang:apply(M, F, A)}
> > >>> ? ? ? ? ? ? ? ? ? catch
> > >>> ? ? ? ? ? ? ? ? ? ? ? Class:Reason ->
> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Stacktrace = erlang:get_stacktrace(),
> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), Class, Reason, Stacktrace}
> > >>> ? ? ? ? ? ? ? ? ? end
> > >>> ? ? ? ? ? end,
> > >>> ? ? {Pid, Ref} = erlang:spawn_monitor(Fun),
> > >>> ? ? receive
> > >>> ? ? ? ? {Pid, Res} ->
> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> > >>> ? ? ? ? ? ? Res;
> > >>> ? ? ? ? {Pid, Class, Reason, Stacktrace} ->
> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> > >>> ? ? ? ? ? ? erlang:error(erlang:raise(Class, Reason, Stacktrace));
> > >>> ? ? ? ? {'DOWN', Ref, process, Pid, Reason} ->
> > >>> ? ? ? ? ? ? erlang:exit(Reason)
> > >>> ? ? end.
> > >>>
> > >>> ________________________________________________________________
> > >>> erlang-bugs (at) erlang.org mailing list.
> > >>> See http://www.erlang.org/faq.html
> > >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > >>
> > >> --
> > >>
> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >>
> > >
> > 
> > ________________________________________________________________
> > erlang-bugs (at) erlang.org mailing list.
> > See http://www.erlang.org/faq.html
> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > 
> 
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> 
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From rickard@REDACTED  Thu May 27 15:37:10 2010
From: rickard@REDACTED (Rickard Green)
Date: Thu, 27 May 2010 15:37:10 +0200
Subject: Problem with function ethr_rwmutex_tryrlock
Message-ID: <4BFE7586.8020102@erlang.org>

> Hi list,
> 
> I think the fallback version of function ethr_rwmutex_tryrlock in
> erts/lib_src/common/ethread.c is not correct. This function should be
> similar with pthread_rwlock_tryrdlock. For pthread_rwlock_tryrdlock, the
> calling thread acquires the read lock if a writer does not hold the lock and
> there are no writers blocked on the lock. But as following code shows,
> ethr_rwmutex_tryrlock doesn't get the lock when there is no waiting writer,
> and acquires the lock when there are waiting writers. Am I right?
> 
>  ethr_rwmutex_tryrlock(ethr_rwmutex *rwmtx)
> {
>     int res;
> #if ETHR_XCHK
>     if (!rwmtx || rwmtx->initialized != ETHR_RWMUTEX_INITIALIZED) {
> ASSERT(0);
> return EINVAL;
>     }
> #endif
>     res = ethr_mutex_trylock__(&rwmtx->mtx);
>     if (res != 0)
> return res;
>     if (!rwmtx->waiting_writers) {
> res = ethr_mutex_unlock__(&rwmtx->mtx);
> if (res == 0)
>     return EBUSY;
> return res;
>     }
>     rwmtx->readers++;
>     return ethr_mutex_unlock__(&rwmtx->mtx);
> }
> 
> Best Regards,
> Jianrong Zhang
> 

Yes, you are right.

      if (!rwmtx->waiting_writers) {

should be

      if (rwmtx->waiting_writers) {

Thanks! It will be fixed in the upcomming release.

Regards,
Rickard

-- 
Rickard Green, Erlang/OTP, Ericsson AB.

From bob@REDACTED  Thu May 27 16:11:03 2010
From: bob@REDACTED (Bob Ippolito)
Date: Thu, 27 May 2010 07:11:03 -0700
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <20100527100248.GA3917@erix.ericsson.se>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com>
	<20100526090626.GA17931@erix.ericsson.se>
	<AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com>
	<AANLkTinr4UpZ2Zpb4XMe90Z0hnkYKcuRL_x3m-i1THqy@mail.gmail.com>
	<20100527074354.GA5584@erix.ericsson.se>
	<20100527100248.GA3917@erix.ericsson.se>
Message-ID: <AANLkTinxcSvt3eMH4nYwfJNdIHMEWXbw7PJIwzPNQVkj@mail.gmail.com>

I can confirm that the first commit fixes the port leak bug.

On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
<raimo+erlang-bugs@REDACTED> wrote:
> I have created a fix for these problems:
> ? ?git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports
>
> It will be included in 'pu'.
>
> Unfortunately, the second commit eliminates the bug trigger
> for what the first commit fixes. So to test if the bug fix
> is fixing the bug, one should apply the first commit only.
>
> On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
>> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
>> > Here's the DNS packet that is being received as a response to the query:
>> >
>> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
>> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
>> > 101,116,0,0,1,0,1>>).
>> > {error,fmt}
>>
>> Thank you very much! I was about to give you detailed instructions
>> about how to dig that up :-)
>>
>> I have spotted the problem.
>>
>> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
>> 60 answer records, but actually contains zero. inet_dns expects to find
>> 60 answer records if it says so. This is a hazy part of the
>> DNS specifications and the resolver I tested truncation on did
>> not do this kind of self-contradiction, but it _may_ be allowed
>> by the specification...
>>
>> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
>> since it should be real-world compatible not just specification compatible.
>> It should allow record shortage in a section if the TC bit is set.
>> I'll try to fix it in R14A.
>>
>> Can you try my patch adding a missing udp_close(S) to see
>> if it stops the leaking port problem? That is a more serious bug.
>>
>> >
>> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
>> > > Well, I'm not sure exactly which scenario is happening because I
>> > > haven't looked at the packets yet, but the manual TCP retry is
>> > > required.
>> > >
>> > > mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
>> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
>> > > [async-threads:4] [hipe] [kernel-poll:true]
>> > >
>> > > Eshell V5.7.5 ?(abort with ^G)
>> > > 1> lists:filter(fun erlang:is_port/1, element(2,
>> > > erlang:process_info(self(), links))).
>> > > []
>> > > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
>> > > {error,timeout}
>> > > 3> lists:filter(fun erlang:is_port/1, element(2,
>> > > erlang:process_info(self(), links))).
>> > > [#Port<0.514>]
>> > > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
>> > > ok
>> > >
>> > >
>> > > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
>> > > <raimo+erlang-bugs@REDACTED> wrote:
>> > >> By reading the code it seems there is a bug when all nameservers
>> > >> return an answer that causes decode errors, or can not be
>> > >> contacted (enetunreach or econnrefused); then an
>> > >> UDP port (or maybe two; one inet and one inet6) is leaked
>> > >> since the inet_res:udp_close/1 is not called.
>> > >>
>> > >> This should be fixed with:
>> > >>
>> > >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
>> > >> index 9b9e078..3d38a01 100644
>> > >> --- a/lib/kernel/src/inet_res.erl
>> > >> +++ b/lib/kernel/src/inet_res.erl
>> > >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
>> > >> ?query_retries(Q, NSs, Timer, Retry, I, S0) ->
>> > >> ? ? Num = length(NSs),
>> > >> ? ? if Num =:= 0 ->
>> > >> + ? ? ? ? ? udp_close(S),
>> > >> ? ? ? ? ? ?{error,timeout};
>> > >> ? ? ? ?true ->
>> > >> ? ? ? ? ? ?case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
>> > >>
>> > >> This "retry with TCP" trick of yours should really not be necessary
>> > >> since inet_res retries with TCP if it gets a truncated UDP answer.
>> > >> Have you got some other case when retrying with TCP is essential?
>> > >>
>> > >> Or, does your DNS server produce a (valid?) result that
>> > >> triggers a debug bug in inet_res, causing the decode error,
>> > >> triggering the port leak bug, forcing you to retry with TCP?
>> > >>
>> > >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
>> > >>> It appears that there may be an inet_udp Port leak in
>> > >>> inet_res:resolve/4, our current workaround is to spawn a new process
>> > >>> to call this function. We've noticed this primarily for a service that
>> > >>> regularly does a UDP DNS query that fails (because the response is too
>> > >>> big) and then we retry over TCP.
>> > >>>
>> > >>> This is what the state of the process looked like when it was leaking ports:
>> > >>>
>> > >>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
>> > >>> erlang:process_info(whereis(dns_gen_server), links)))).
>> > >>> 577
>> > >>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
>> > >>> lists:filter(fun erlang:is_port/1, element(2,
>> > >>> erlang:process_info(whereis(dns_gen_server), links)))]).
>> > >>> [{name,"udp_inet"}]
>> > >>>
>> > >>> The code looked like this, before the workaround was implemented:
>> > >>>
>> > >>> %% @spec dns(string()) -> [string()]
>> > >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
>> > >>> %% ? ? This may return an empty list if there no A records for this Host name.
>> > >>> dns(Host) when is_list(Host) ->
>> > >>> ? ? dns(Host, fun inet_res:resolve/4).
>> > >>>
>> > >>> dns(Host, ResolveFun) ->
>> > >>> ? ? case ResolveFun(Host, in, a, []) of
>> > >>> ? ? ? ? {ok, Msg} ->
>> > >>> ? ? ? ? ? ? ips_for_answers(Msg);
>> > >>> ? ? ? ? {error, {nxdomain, _}} ->
>> > >>> ? ? ? ? ? ? [];
>> > >>> ? ? ? ? {error, timeout} ->
>> > >>> ? ? ? ? ? ? %% retry with TCP
>> > >>> ? ? ? ? ? ? case ResolveFun(Host, in, a, [{usevc, true}]) of
>> > >>> ? ? ? ? ? ? ? ? {ok, Msg} ->
>> > >>> ? ? ? ? ? ? ? ? ? ? ips_for_answers(Msg);
>> > >>> ? ? ? ? ? ? ? ? {error, {nxdomain, _}} ->
>> > >>> ? ? ? ? ? ? ? ? ? ? [];
>> > >>> ? ? ? ? ? ? ? ? Error = {error, _} ->
>> > >>> ? ? ? ? ? ? ? ? ? ? Error
>> > >>> ? ? ? ? ? ? end;
>> > >>> ? ? ? ? Error = {error, _} ->
>> > >>> ? ? ? ? ? ? Error
>> > >>> ? ? end.
>> > >>>
>> > >>> ips_for_answers(Msg) ->
>> > >>> ? ? [inet_parse:ntoa(inet_dns:rr(Answer, data))
>> > >>> ? ? ?|| Answer <- inet_dns:msg(Msg, anlist)].
>> > >>>
>> > >>> The workaround we used was to call it indirectly with this function, I
>> > >>> couldn't find anything in OTP that did the same thing that didn't have
>> > >>> local call optimizations.
>> > >>>
>> > >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
>> > >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
>> > >>> process_apply(M,F,A) ->
>> > >>> ? ? %% We can't just use rpc here because there's a local call optimization.
>> > >>> ? ? Parent = self(),
>> > >>> ? ? Fun = fun () ->
>> > >>> ? ? ? ? ? ? ? ? ? try
>> > >>> ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), erlang:apply(M, F, A)}
>> > >>> ? ? ? ? ? ? ? ? ? catch
>> > >>> ? ? ? ? ? ? ? ? ? ? ? Class:Reason ->
>> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Stacktrace = erlang:get_stacktrace(),
>> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), Class, Reason, Stacktrace}
>> > >>> ? ? ? ? ? ? ? ? ? end
>> > >>> ? ? ? ? ? end,
>> > >>> ? ? {Pid, Ref} = erlang:spawn_monitor(Fun),
>> > >>> ? ? receive
>> > >>> ? ? ? ? {Pid, Res} ->
>> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
>> > >>> ? ? ? ? ? ? Res;
>> > >>> ? ? ? ? {Pid, Class, Reason, Stacktrace} ->
>> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
>> > >>> ? ? ? ? ? ? erlang:error(erlang:raise(Class, Reason, Stacktrace));
>> > >>> ? ? ? ? {'DOWN', Ref, process, Pid, Reason} ->
>> > >>> ? ? ? ? ? ? erlang:exit(Reason)
>> > >>> ? ? end.
>> > >>>
>> > >>> ________________________________________________________________
>> > >>> erlang-bugs (at) erlang.org mailing list.
>> > >>> See http://www.erlang.org/faq.html
>> > >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>> > >>
>> > >> --
>> > >>
>> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>> > >>
>> > >
>> >
>> > ________________________________________________________________
>> > erlang-bugs (at) erlang.org mailing list.
>> > See http://www.erlang.org/faq.html
>> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>> >
>>
>> --
>>
>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>
>> ________________________________________________________________
>> erlang-bugs (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>

From raimo+erlang-bugs@REDACTED  Thu May 27 16:26:20 2010
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Thu, 27 May 2010 16:26:20 +0200
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <AANLkTinxcSvt3eMH4nYwfJNdIHMEWXbw7PJIwzPNQVkj@mail.gmail.com>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com> <20100526090626.GA17931@erix.ericsson.se> <AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com> <AANLkTinr4UpZ2Zpb4XMe90Z0hnkYKcuRL_x3m-i1THqy@mail.gmail.com> <20100527074354.GA5584@erix.ericsson.se> <20100527100248.GA3917@erix.ericsson.se> <AANLkTinxcSvt3eMH4nYwfJNdIHMEWXbw7PJIwzPNQVkj@mail.gmail.com>
Message-ID: <20100527142620.GA15167@erix.ericsson.se>

On Thu, May 27, 2010 at 07:11:03AM -0700, Bob Ippolito wrote:
> I can confirm that the first commit fixes the port leak bug.

Great! Then that branch should be complete. The second
commit makes your DNS reply message below decode with
the TC bit set, which should make inet_res retry with
'usevc' internally, obsoleting your wrapper (hopefully).

> 
> On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
> <raimo+erlang-bugs@REDACTED> wrote:
> > I have created a fix for these problems:
> > ? ?git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports
> >
> > It will be included in 'pu'.
> >
> > Unfortunately, the second commit eliminates the bug trigger
> > for what the first commit fixes. So to test if the bug fix
> > is fixing the bug, one should apply the first commit only.
> >
> > On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
> >> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> >> > Here's the DNS packet that is being received as a response to the query:
> >> >
> >> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> >> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> >> > 101,116,0,0,1,0,1>>).
> >> > {error,fmt}
> >>
> >> Thank you very much! I was about to give you detailed instructions
> >> about how to dig that up :-)
> >>
> >> I have spotted the problem.
> >>
> >> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
> >> 60 answer records, but actually contains zero. inet_dns expects to find
> >> 60 answer records if it says so. This is a hazy part of the
> >> DNS specifications and the resolver I tested truncation on did
> >> not do this kind of self-contradiction, but it _may_ be allowed
> >> by the specification...
> >>
> >> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
> >> since it should be real-world compatible not just specification compatible.
> >> It should allow record shortage in a section if the TC bit is set.
> >> I'll try to fix it in R14A.
> >>
> >> Can you try my patch adding a missing udp_close(S) to see
> >> if it stops the leaking port problem? That is a more serious bug.
> >>
> >> >
> >> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> >> > > Well, I'm not sure exactly which scenario is happening because I
> >> > > haven't looked at the packets yet, but the manual TCP retry is
> >> > > required.
> >> > >
> >> > > mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
> >> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> >> > > [async-threads:4] [hipe] [kernel-poll:true]
> >> > >
> >> > > Eshell V5.7.5 ?(abort with ^G)
> >> > > 1> lists:filter(fun erlang:is_port/1, element(2,
> >> > > erlang:process_info(self(), links))).
> >> > > []
> >> > > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> >> > > {error,timeout}
> >> > > 3> lists:filter(fun erlang:is_port/1, element(2,
> >> > > erlang:process_info(self(), links))).
> >> > > [#Port<0.514>]
> >> > > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> >> > > ok
> >> > >
> >> > >
> >> > > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> >> > > <raimo+erlang-bugs@REDACTED> wrote:
> >> > >> By reading the code it seems there is a bug when all nameservers
> >> > >> return an answer that causes decode errors, or can not be
> >> > >> contacted (enetunreach or econnrefused); then an
> >> > >> UDP port (or maybe two; one inet and one inet6) is leaked
> >> > >> since the inet_res:udp_close/1 is not called.
> >> > >>
> >> > >> This should be fixed with:
> >> > >>
> >> > >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> >> > >> index 9b9e078..3d38a01 100644
> >> > >> --- a/lib/kernel/src/inet_res.erl
> >> > >> +++ b/lib/kernel/src/inet_res.erl
> >> > >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
> >> > >> ?query_retries(Q, NSs, Timer, Retry, I, S0) ->
> >> > >> ? ? Num = length(NSs),
> >> > >> ? ? if Num =:= 0 ->
> >> > >> + ? ? ? ? ? udp_close(S),
> >> > >> ? ? ? ? ? ?{error,timeout};
> >> > >> ? ? ? ?true ->
> >> > >> ? ? ? ? ? ?case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
> >> > >>
> >> > >> This "retry with TCP" trick of yours should really not be necessary
> >> > >> since inet_res retries with TCP if it gets a truncated UDP answer.
> >> > >> Have you got some other case when retrying with TCP is essential?
> >> > >>
> >> > >> Or, does your DNS server produce a (valid?) result that
> >> > >> triggers a debug bug in inet_res, causing the decode error,
> >> > >> triggering the port leak bug, forcing you to retry with TCP?
> >> > >>
> >> > >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> >> > >>> It appears that there may be an inet_udp Port leak in
> >> > >>> inet_res:resolve/4, our current workaround is to spawn a new process
> >> > >>> to call this function. We've noticed this primarily for a service that
> >> > >>> regularly does a UDP DNS query that fails (because the response is too
> >> > >>> big) and then we retry over TCP.
> >> > >>>
> >> > >>> This is what the state of the process looked like when it was leaking ports:
> >> > >>>
> >> > >>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
> >> > >>> erlang:process_info(whereis(dns_gen_server), links)))).
> >> > >>> 577
> >> > >>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
> >> > >>> lists:filter(fun erlang:is_port/1, element(2,
> >> > >>> erlang:process_info(whereis(dns_gen_server), links)))]).
> >> > >>> [{name,"udp_inet"}]
> >> > >>>
> >> > >>> The code looked like this, before the workaround was implemented:
> >> > >>>
> >> > >>> %% @spec dns(string()) -> [string()]
> >> > >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> >> > >>> %% ? ? This may return an empty list if there no A records for this Host name.
> >> > >>> dns(Host) when is_list(Host) ->
> >> > >>> ? ? dns(Host, fun inet_res:resolve/4).
> >> > >>>
> >> > >>> dns(Host, ResolveFun) ->
> >> > >>> ? ? case ResolveFun(Host, in, a, []) of
> >> > >>> ? ? ? ? {ok, Msg} ->
> >> > >>> ? ? ? ? ? ? ips_for_answers(Msg);
> >> > >>> ? ? ? ? {error, {nxdomain, _}} ->
> >> > >>> ? ? ? ? ? ? [];
> >> > >>> ? ? ? ? {error, timeout} ->
> >> > >>> ? ? ? ? ? ? %% retry with TCP
> >> > >>> ? ? ? ? ? ? case ResolveFun(Host, in, a, [{usevc, true}]) of
> >> > >>> ? ? ? ? ? ? ? ? {ok, Msg} ->
> >> > >>> ? ? ? ? ? ? ? ? ? ? ips_for_answers(Msg);
> >> > >>> ? ? ? ? ? ? ? ? {error, {nxdomain, _}} ->
> >> > >>> ? ? ? ? ? ? ? ? ? ? [];
> >> > >>> ? ? ? ? ? ? ? ? Error = {error, _} ->
> >> > >>> ? ? ? ? ? ? ? ? ? ? Error
> >> > >>> ? ? ? ? ? ? end;
> >> > >>> ? ? ? ? Error = {error, _} ->
> >> > >>> ? ? ? ? ? ? Error
> >> > >>> ? ? end.
> >> > >>>
> >> > >>> ips_for_answers(Msg) ->
> >> > >>> ? ? [inet_parse:ntoa(inet_dns:rr(Answer, data))
> >> > >>> ? ? ?|| Answer <- inet_dns:msg(Msg, anlist)].
> >> > >>>
> >> > >>> The workaround we used was to call it indirectly with this function, I
> >> > >>> couldn't find anything in OTP that did the same thing that didn't have
> >> > >>> local call optimizations.
> >> > >>>
> >> > >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> >> > >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> >> > >>> process_apply(M,F,A) ->
> >> > >>> ? ? %% We can't just use rpc here because there's a local call optimization.
> >> > >>> ? ? Parent = self(),
> >> > >>> ? ? Fun = fun () ->
> >> > >>> ? ? ? ? ? ? ? ? ? try
> >> > >>> ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), erlang:apply(M, F, A)}
> >> > >>> ? ? ? ? ? ? ? ? ? catch
> >> > >>> ? ? ? ? ? ? ? ? ? ? ? Class:Reason ->
> >> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Stacktrace = erlang:get_stacktrace(),
> >> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), Class, Reason, Stacktrace}
> >> > >>> ? ? ? ? ? ? ? ? ? end
> >> > >>> ? ? ? ? ? end,
> >> > >>> ? ? {Pid, Ref} = erlang:spawn_monitor(Fun),
> >> > >>> ? ? receive
> >> > >>> ? ? ? ? {Pid, Res} ->
> >> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> >> > >>> ? ? ? ? ? ? Res;
> >> > >>> ? ? ? ? {Pid, Class, Reason, Stacktrace} ->
> >> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> >> > >>> ? ? ? ? ? ? erlang:error(erlang:raise(Class, Reason, Stacktrace));
> >> > >>> ? ? ? ? {'DOWN', Ref, process, Pid, Reason} ->
> >> > >>> ? ? ? ? ? ? erlang:exit(Reason)
> >> > >>> ? ? end.
> >> > >>>
> >> > >>> ________________________________________________________________
> >> > >>> erlang-bugs (at) erlang.org mailing list.
> >> > >>> See http://www.erlang.org/faq.html
> >> > >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >> > >>
> >> > >> --
> >> > >>
> >> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >> > >>
> >> > >
> >> >
> >> > ________________________________________________________________
> >> > erlang-bugs (at) erlang.org mailing list.
> >> > See http://www.erlang.org/faq.html
> >> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >> >
> >>
> >> --
> >>
> >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >>
> >> ________________________________________________________________
> >> erlang-bugs (at) erlang.org mailing list.
> >> See http://www.erlang.org/faq.html
> >> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >
> > --
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
> 
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> 

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From vlm@REDACTED  Sat May 29 02:49:11 2010
From: vlm@REDACTED (Lev Walkin)
Date: Fri, 28 May 2010 17:49:11 -0700
Subject: http:request memory leak in R12B04
In-Reply-To: <87d3wf69a3.fsf@cronqvi.st>
References: <484B8939-A4BD-4F95-986B-973C82F84D54@lionet.info> <87d3wf69a3.fsf@cronqvi.st>
Message-ID: <34B378AB-7963-4819-8AA8-C81AD913648E@lionet.info>


No idea. I haven't received any follow-up, and the available code base  
on git never changed in this respect, as far as I can tell.

We are running with a custom version on our servers and thinking of  
dropping Erlang in favor of PHP.


On May 28, 2010, at 1:50 PM, mats cronqvist wrote:

> Lev Walkin <vlm@REDACTED> writes:
>
>  what's up with this? renders httpc unusable, as far as I can tell.
>
>  mats
>
>> The R12B04 release brought a reliable memory leak to http:request  
>> that
>> was never in there before.

-- 
vlm


From vlm@REDACTED  Sat May 29 06:46:40 2010
From: vlm@REDACTED (Lev Walkin)
Date: Fri, 28 May 2010 21:46:40 -0700
Subject: [erlang-bugs] Re: http:request memory leak in R12B04
In-Reply-To: <34B378AB-7963-4819-8AA8-C81AD913648E@lionet.info>
References: <484B8939-A4BD-4F95-986B-973C82F84D54@lionet.info> <87d3wf69a3.fsf@cronqvi.st> <34B378AB-7963-4819-8AA8-C81AD913648E@lionet.info>
Message-ID: <B4089131-1646-4397-BB8B-CA64A519A42F@lionet.info>


I must correct myself here. Just checked the OTP source code and it  
turned out the bug was finally fixed in inets-5.3.2, roughly in accord  
with the patch we've submitted on March 22.

Here's the proof:

http://github.com/erlang/otp/commit/91c89d54d45989a85367f10d5902b9b508754a49


On May 28, 2010, at 5:49 PM, Lev Walkin wrote:

>
> No idea. I haven't received any follow-up, and the available code  
> base on git never changed in this respect, as far as I can tell.
>
> We are running with a custom version on our servers and thinking of  
> dropping Erlang in favor of PHP.
>
>
> On May 28, 2010, at 1:50 PM, mats cronqvist wrote:
>
>> Lev Walkin <vlm@REDACTED> writes:
>>
>> what's up with this? renders httpc unusable, as far as I can tell.
>>
>> mats
>>
>>> The R12B04 release brought a reliable memory leak to http:request  
>>> that
>>> was never in there before.
>
> -- 
> vlm
>
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>

-- 
vlm


From kostis@REDACTED  Sat May 29 14:35:59 2010
From: kostis@REDACTED (Kostis Sagonas)
Date: Sat, 29 May 2010 15:35:59 +0300
Subject: Inviso
Message-ID: <4C010A2F.30004@cs.ntua.gr>

Ulf Wiger suggested a cleanup of the 'inviso' application by tidier and 
I've already done that.  I'll submit a patch via github early next week.

Before starting, I've already noticed that dialyzer complains that:

inviso_tool.erl:586: The pattern {'error', Reason} can never match the 
type {'ok',#ld{...}}

which refers to the init/1 function:

init(Config) ->
     case fetch_configuration(Config) of     % From conf-file and Config.
	{ok,#ld{}=LD} ->
	    case start_inviso_at_c_node(LD) of
		...
	    end;
	{error,Reason} ->
	    {stop,{error,{start_up,Reason}}}
     end.

due to the fact that fetch_configuration/1 returns {ok,...} in all its 
branches -- even in the error case:

fetch_configuration(Config) ->
     case fetch_config_filename(Config) of
	{ok,FName} ->              % We are supposed to use a conf-file.
	    case read_config_file(FName) of
		{ok,LD} ->         % Managed to open a file.
		    NewLD=read_config_list(LD,Config),
		    {ok,NewLD};
		{error,_Reason} -> % Problem finding/opening file.
		    LD=read_config_list(#ld{},Config),
		    {ok,LD}
	    end;
	false ->                   % No filename specified.
	    LD=read_config_list(#ld{},Config),
	    {ok,LD}
     end.

Dialyzer is right, but the question is how should this one be fixed? 
Simply by taking the {error,Reason} case from init/1 or by returning 
{error,Reason} in the {error,Reason} case of fetch_configuration/1?

Kostis

From kostis@REDACTED  Sat May 29 20:41:25 2010
From: kostis@REDACTED (Kostis Sagonas)
Date: Sat, 29 May 2010 21:41:25 +0300
Subject: [erlang-bugs] Inviso
In-Reply-To: <4C010A2F.30004@cs.ntua.gr>
References: <4C010A2F.30004@cs.ntua.gr>
Message-ID: <4C015FD5.6070706@cs.ntua.gr>

Some more confusion in inviso.

There is a record definition which reads:

   %% The loopdata record.
   -record(ld,{...
               session_state=passive,            % passive | tracing
	      ...}).

leading one to believe that this field is to be assigned the values 
'passive' or 'tracing'.  Yet, on line 844 there is an assignment:

            LD#ld{session_state=passive_sessionstate(),
                  nodes=NewNodesD,
                  ....

The problem is that the definition of passive_sessionstate/0 reads 
(comment is actually from the code - line 2962):

   %% Returns the correct value indicating that the tool is not tracing.
   passive_sessionstate() ->
       idle.

Does anybody know which are the values that this field can have?

Kostis

From raimo+erlang-bugs@REDACTED  Mon May 31 14:46:13 2010
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Mon, 31 May 2010 14:46:13 +0200
Subject: [erlang-bugs] R13B04 inet_res:resolve/4 inet_udp Port leak
In-Reply-To: <20100527142620.GA15167@erix.ericsson.se>
References: <AANLkTiktTZZDafs3DX9agszbBrmNNOs_o5fQPNWLx0I3@mail.gmail.com> <20100526090626.GA17931@erix.ericsson.se> <AANLkTiltmf6LQCTE98lkh-c5sVpFxkjh5w6YKeK9Qf-m@mail.gmail.com> <AANLkTinr4UpZ2Zpb4XMe90Z0hnkYKcuRL_x3m-i1THqy@mail.gmail.com> <20100527074354.GA5584@erix.ericsson.se> <20100527100248.GA3917@erix.ericsson.se> <AANLkTinxcSvt3eMH4nYwfJNdIHMEWXbw7PJIwzPNQVkj@mail.gmail.com> <20100527142620.GA15167@erix.ericsson.se>
Message-ID: <20100531124613.GA15756@erix.ericsson.se>

On Thu, May 27, 2010 at 04:26:20PM +0200, Raimo Niskanen wrote:
> On Thu, May 27, 2010 at 07:11:03AM -0700, Bob Ippolito wrote:
> > I can confirm that the first commit fixes the port leak bug.
> 
> Great! Then that branch should be complete. The second
> commit makes your DNS reply message below decode with
> the TC bit set, which should make inet_res retry with
> 'usevc' internally, obsoleting your wrapper (hopefully).

4aa2ead3149d3727ec6ad67b653ff51c74405671
New commit. The previous only worked for your special case.
What is not tested does not work...

It will be included in 'pu'.

> 
> > 
> > On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
> > <raimo+erlang-bugs@REDACTED> wrote:
> > > I have created a fix for these problems:
> > > ? ?git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports
> > >
> > > It will be included in 'pu'.
> > >
> > > Unfortunately, the second commit eliminates the bug trigger
> > > for what the first commit fixes. So to test if the bug fix
> > > is fixing the bug, one should apply the first commit only.
> > >
> > > On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
> > >> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> > >> > Here's the DNS packet that is being received as a response to the query:
> > >> >
> > >> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> > >> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> > >> > 101,116,0,0,1,0,1>>).
> > >> > {error,fmt}
> > >>
> > >> Thank you very much! I was about to give you detailed instructions
> > >> about how to dig that up :-)
> > >>
> > >> I have spotted the problem.
> > >>
> > >> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
> > >> 60 answer records, but actually contains zero. inet_dns expects to find
> > >> 60 answer records if it says so. This is a hazy part of the
> > >> DNS specifications and the resolver I tested truncation on did
> > >> not do this kind of self-contradiction, but it _may_ be allowed
> > >> by the specification...
> > >>
> > >> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
> > >> since it should be real-world compatible not just specification compatible.
> > >> It should allow record shortage in a section if the TC bit is set.
> > >> I'll try to fix it in R14A.
> > >>
> > >> Can you try my patch adding a missing udp_close(S) to see
> > >> if it stops the leaking port problem? That is a more serious bug.
> > >>
> > >> >
> > >> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> > >> > > Well, I'm not sure exactly which scenario is happening because I
> > >> > > haven't looked at the packets yet, but the manual TCP retry is
> > >> > > required.
> > >> > >
> > >> > > mochi@REDACTED:~$ /mochi/opt/erlang-R13B04/bin/erl
> > >> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> > >> > > [async-threads:4] [hipe] [kernel-poll:true]
> > >> > >
> > >> > > Eshell V5.7.5 ?(abort with ^G)
> > >> > > 1> lists:filter(fun erlang:is_port/1, element(2,
> > >> > > erlang:process_info(self(), links))).
> > >> > > []
> > >> > > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> > >> > > {error,timeout}
> > >> > > 3> lists:filter(fun erlang:is_port/1, element(2,
> > >> > > erlang:process_info(self(), links))).
> > >> > > [#Port<0.514>]
> > >> > > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> > >> > > ok
> > >> > >
> > >> > >
> > >> > > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> > >> > > <raimo+erlang-bugs@REDACTED> wrote:
> > >> > >> By reading the code it seems there is a bug when all nameservers
> > >> > >> return an answer that causes decode errors, or can not be
> > >> > >> contacted (enetunreach or econnrefused); then an
> > >> > >> UDP port (or maybe two; one inet and one inet6) is leaked
> > >> > >> since the inet_res:udp_close/1 is not called.
> > >> > >>
> > >> > >> This should be fixed with:
> > >> > >>
> > >> > >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> > >> > >> index 9b9e078..3d38a01 100644
> > >> > >> --- a/lib/kernel/src/inet_res.erl
> > >> > >> +++ b/lib/kernel/src/inet_res.erl
> > >> > >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
> > >> > >> ?query_retries(Q, NSs, Timer, Retry, I, S0) ->
> > >> > >> ? ? Num = length(NSs),
> > >> > >> ? ? if Num =:= 0 ->
> > >> > >> + ? ? ? ? ? udp_close(S),
> > >> > >> ? ? ? ? ? ?{error,timeout};
> > >> > >> ? ? ? ?true ->
> > >> > >> ? ? ? ? ? ?case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
> > >> > >>
> > >> > >> This "retry with TCP" trick of yours should really not be necessary
> > >> > >> since inet_res retries with TCP if it gets a truncated UDP answer.
> > >> > >> Have you got some other case when retrying with TCP is essential?
> > >> > >>
> > >> > >> Or, does your DNS server produce a (valid?) result that
> > >> > >> triggers a debug bug in inet_res, causing the decode error,
> > >> > >> triggering the port leak bug, forcing you to retry with TCP?
> > >> > >>
> > >> > >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> > >> > >>> It appears that there may be an inet_udp Port leak in
> > >> > >>> inet_res:resolve/4, our current workaround is to spawn a new process
> > >> > >>> to call this function. We've noticed this primarily for a service that
> > >> > >>> regularly does a UDP DNS query that fails (because the response is too
> > >> > >>> big) and then we retry over TCP.
> > >> > >>>
> > >> > >>> This is what the state of the process looked like when it was leaking ports:
> > >> > >>>
> > >> > >>> (node@REDACTED)1> length(lists:filter(fun erlang:is_port/1, element(2,
> > >> > >>> erlang:process_info(whereis(dns_gen_server), links)))).
> > >> > >>> 577
> > >> > >>> (node@REDACTED)2> lists:usort([erlang:port_info(P, name) || P <-
> > >> > >>> lists:filter(fun erlang:is_port/1, element(2,
> > >> > >>> erlang:process_info(whereis(dns_gen_server), links)))]).
> > >> > >>> [{name,"udp_inet"}]
> > >> > >>>
> > >> > >>> The code looked like this, before the workaround was implemented:
> > >> > >>>
> > >> > >>> %% @spec dns(string()) -> [string()]
> > >> > >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> > >> > >>> %% ? ? This may return an empty list if there no A records for this Host name.
> > >> > >>> dns(Host) when is_list(Host) ->
> > >> > >>> ? ? dns(Host, fun inet_res:resolve/4).
> > >> > >>>
> > >> > >>> dns(Host, ResolveFun) ->
> > >> > >>> ? ? case ResolveFun(Host, in, a, []) of
> > >> > >>> ? ? ? ? {ok, Msg} ->
> > >> > >>> ? ? ? ? ? ? ips_for_answers(Msg);
> > >> > >>> ? ? ? ? {error, {nxdomain, _}} ->
> > >> > >>> ? ? ? ? ? ? [];
> > >> > >>> ? ? ? ? {error, timeout} ->
> > >> > >>> ? ? ? ? ? ? %% retry with TCP
> > >> > >>> ? ? ? ? ? ? case ResolveFun(Host, in, a, [{usevc, true}]) of
> > >> > >>> ? ? ? ? ? ? ? ? {ok, Msg} ->
> > >> > >>> ? ? ? ? ? ? ? ? ? ? ips_for_answers(Msg);
> > >> > >>> ? ? ? ? ? ? ? ? {error, {nxdomain, _}} ->
> > >> > >>> ? ? ? ? ? ? ? ? ? ? [];
> > >> > >>> ? ? ? ? ? ? ? ? Error = {error, _} ->
> > >> > >>> ? ? ? ? ? ? ? ? ? ? Error
> > >> > >>> ? ? ? ? ? ? end;
> > >> > >>> ? ? ? ? Error = {error, _} ->
> > >> > >>> ? ? ? ? ? ? Error
> > >> > >>> ? ? end.
> > >> > >>>
> > >> > >>> ips_for_answers(Msg) ->
> > >> > >>> ? ? [inet_parse:ntoa(inet_dns:rr(Answer, data))
> > >> > >>> ? ? ?|| Answer <- inet_dns:msg(Msg, anlist)].
> > >> > >>>
> > >> > >>> The workaround we used was to call it indirectly with this function, I
> > >> > >>> couldn't find anything in OTP that did the same thing that didn't have
> > >> > >>> local call optimizations.
> > >> > >>>
> > >> > >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> > >> > >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> > >> > >>> process_apply(M,F,A) ->
> > >> > >>> ? ? %% We can't just use rpc here because there's a local call optimization.
> > >> > >>> ? ? Parent = self(),
> > >> > >>> ? ? Fun = fun () ->
> > >> > >>> ? ? ? ? ? ? ? ? ? try
> > >> > >>> ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), erlang:apply(M, F, A)}
> > >> > >>> ? ? ? ? ? ? ? ? ? catch
> > >> > >>> ? ? ? ? ? ? ? ? ? ? ? Class:Reason ->
> > >> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Stacktrace = erlang:get_stacktrace(),
> > >> > >>> ? ? ? ? ? ? ? ? ? ? ? ? ? Parent ! {self(), Class, Reason, Stacktrace}
> > >> > >>> ? ? ? ? ? ? ? ? ? end
> > >> > >>> ? ? ? ? ? end,
> > >> > >>> ? ? {Pid, Ref} = erlang:spawn_monitor(Fun),
> > >> > >>> ? ? receive
> > >> > >>> ? ? ? ? {Pid, Res} ->
> > >> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> > >> > >>> ? ? ? ? ? ? Res;
> > >> > >>> ? ? ? ? {Pid, Class, Reason, Stacktrace} ->
> > >> > >>> ? ? ? ? ? ? receive {'DOWN', Ref, process, Pid, _} -> ok end,
> > >> > >>> ? ? ? ? ? ? erlang:error(erlang:raise(Class, Reason, Stacktrace));
> > >> > >>> ? ? ? ? {'DOWN', Ref, process, Pid, Reason} ->
> > >> > >>> ? ? ? ? ? ? erlang:exit(Reason)
> > >> > >>> ? ? end.
> > >> > >>>
> > >> > >>> ________________________________________________________________
> > >> > >>> erlang-bugs (at) erlang.org mailing list.
> > >> > >>> See http://www.erlang.org/faq.html
> > >> > >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > >> > >>
> > >> > >> --
> > >> > >>
> > >> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >> > >>
> > >> > >
> > >> >
> > >> > ________________________________________________________________
> > >> > erlang-bugs (at) erlang.org mailing list.
> > >> > See http://www.erlang.org/faq.html
> > >> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > >> >
> > >>
> > >> --
> > >>
> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >>
> > >> ________________________________________________________________
> > >> erlang-bugs (at) erlang.org mailing list.
> > >> See http://www.erlang.org/faq.html
> > >> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > >
> > > --
> > >
> > > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >
> > 
> > ________________________________________________________________
> > erlang-bugs (at) erlang.org mailing list.
> > See http://www.erlang.org/faq.html
> > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> > 
> 
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> 
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB