From vlm@REDACTED Tue Sep 2 05:35:23 2008
From: vlm@REDACTED (Lev Walkin)
Date: Mon, 01 Sep 2008 20:35:23 -0700
Subject: [erlang-bugs] R12B-3: string:to_integer() sporadic failures
Message-ID: <48BCB47B.3040503@lionet.info>

Hi,

we all love string to integer conversion routines, such as
string:to_integer/1 or erlang:list_to_integer/1.

The functions serve us well and indeed provide us with advertised
functionality almost every time.

However, we've noticed some oddity on our production system which,
after two weeks of jaw-dropping musings, has boiled down to a strange
idempotence violation in the string:to_integer/1 implementation.

To those of you impatient enough, please look into the code
attached and try running it as nfail:test(10000) and go down from
there.

Here's that string:to_integer/1 function, having the following
signature:

    string:to_integer(String) -> {Int,Rest} | {error,Reason}

Here's a typical invocation resulting in a number and the
rest of the unparsed string returned:

    36> string:to_integer("134217728,\n").
    {134217728,",\n"}
    37>

What would happen if we did it once more?

    37> string:to_integer("134217728,\n").
    {134217728,",\n"}
    38>

By this time we can be reasonably sure that string:to_integer/1 will
return the same output given the same input. We have seen that this
is indeed the case by testing it twice. Could it be that testing
it N times would result in a bad behavior? Nah, unlikely, you say.

If you haven't looked at the attached code, it is time to do so.
In the code, we create a list of N results of the string:to_integer/1
application, like this:

    iterate(0, Acc) -> Acc;
    iterate(N, Acc) ->
        iterate(N - 1,
                [case string:to_integer("134217728,\n") of
                     {Int, _} -> Int
                 end | Acc]).

This code utilizes tail recursion with an accumulator list
which gets prepended N times by string:to_integer/1 output,
undoubtedly an integer. (Implementation with a map over lists:seq()
output can do as well).
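For reference, the lists:seq() alternative mentioned above could look roughly like the sketch below. The module name nfail_map and the function iterate/1 are made up for illustration and are not part of the attached nfail.erl; on an unaffected emulator every element of the result should be the integer 134217728.

```erlang
-module(nfail_map).
-export([iterate/1]).

%% Hypothetical variant of iterate/2 (not from the attachment):
%% builds the list of N conversion results with lists:map/2 over
%% lists:seq/2 instead of an explicit accumulator.
iterate(N) ->
    lists:map(fun(_) ->
                      %% Keep only the integer part of {Int, Rest}.
                      {Int, _Rest} = string:to_integer("134217728,\n"),
                      Int
              end,
              lists:seq(1, N)).
```

The result can then be filtered exactly as in the test described next, for example with [X || X <- nfail_map:iterate(N), X =/= 134217728].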

So we spawn and run this iterate/2 function, appropriately checking
that the list consists only of integers with value 134217728:

    test(N) ->
        {Self, Ref} = {self(), make_ref()},
        spawn_opt(fun()->
                L = iterate(N, []),
                % Filter out non-conforming entries
                BadList = [X || X <- L, X =/= 134217728],
                BadLen = length(BadList),
                % Here, BadLen should always be 0!
                io:format("~b bad conversions: ~p~n", [BadLen, BadList]),
                Self ! {done, Ref, BadLen}
            end,
            [link,{fullsweep_after,0}]),
        receive {done, Ref, Len} -> Len end.

Based on the smoke tests above, this code should always result
in something like this:

    38> nfail:test(100).
    0 bad conversions: []
    0
    39>

But let's start testing it thoroughly, as if not believing ourselves
that such a simple function could possibly fail at times:

[vlm@REDACTED:~]> erl
Erlang (BEAM) emulator version 5.6.3 [source] [async-threads:0]
[kernel-poll:false]

Eshell V5.6.3 (abort with ^G)
1> c(nfail).
{ok,nfail}
2> nfail:test(1).
0 bad conversions: []
0
3> nfail:test(2).
0 bad conversions: []
0
4> nfail:test(100).
0 bad conversions: []
0
5> nfail:test(1000).
2 bad conversions: [{134217728,",\n"},{134217728,",\n"}]
2
6> nfail:test(10000).
5 bad conversions: [{134217728,",\n"},
                    {134217728,",\n"},
                    {134217728,",\n"},
                    {134217728,",\n"},
                    {134217728,",\n"}]
5
7>
...
32> nfail:test(810).
0 bad conversions: []
0
33> nfail:test(811).
2 bad conversions: [{134217728,",\n"},{134217728,",\n"}]
2
34> nfail:test(810).
0 bad conversions: []
0
35> nfail:test(811).
2 bad conversions: [{134217728,",\n"},{134217728,",\n"}]
2
36>

As you see, string:to_integer/1 consistently generates bad entries
which contain {134217728,",\n"} instead of 134217728, but only
when the number of invocations reaches about 1000 (811 in this
particular sequence of steps).

Perhaps this is a platform glitch? The above code was being executed
on a ppc (G4) Mac OS X 10.5 with R12B-3 (32-bit) built from scratch.
Here's what Sun sparc v9 with Solaris 10 thinks about that test case:

[vlm@REDACTED:~]> erl
Erlang (BEAM) emulator version 5.6.3 [source] [async-threads:0]
[hipe] [kernel-poll:false]

Eshell V5.6.3 (abort with ^G)
1> c(nfail).
{ok,nfail}
2> nfail:test(100).
1 bad conversions: [{134217728,",\n"}]
1
3> nfail:test(1000).
3 bad conversions: [{134217728,",\n"},{134217728,",\n"},{134217728,",\n"}]
3
4> nfail:test(10).
0 bad conversions: []
0
5> nfail:test(50).
0 bad conversions: []
0
6> c(nfail, [native]).
{ok,nfail}
7> nfail:test(500).
4 bad conversions: [{134217728,",\n"},
                    {134217728,",\n"},
                    {134217728,",\n"},
                    {134217728,",\n"}]
4
8>

And here's what a 64-bit Intel box (amd64, FreeBSD 6.3) thinks:

[vlm@REDACTED ~]$ erl
Erlang (BEAM) emulator version 5.6.3 [source] [64-bit]
[async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.6.3 (abort with ^G)
1> c(nfail).
{ok,nfail}
2> nfail:test(100).
0 bad conversions: []
0
3> nfail:test(1000).
0 bad conversions: []
0
4> nfail:test(10000).
0 bad conversions: []
0
5> nfail:test(100000).
0 bad conversions: []
0
6> nfail:test(1000000).
0 bad conversions: []
0
7>

Oops, it went very well, surprisingly. Perhaps it is a purely
non-Intel chip problem? Let's try it on a non-64-bit machine,
such as a Pentium D under Microsoft Windows Vista in 32-bit mode:

Erlang (BEAM) emulator version 5.6.3 [smp:2] [async-threads:0]

Eshell V5.6.3 (abort with ^G)
1> c('c:/tmp/nfail.erl').
{ok,nfail}
2> nfail:test(100).
0 bad conversions: []
0
3> nfail:test(1000).
2 bad conversions: [{134217728,",\n"},{134217728,",\n"}]
2
4>

Aha! See the pattern: 32-bit Erlang installations on many hardware
platforms are having very similar problems. They are unable to
consistently convert a string containing an integer into an integer
value. Incidentally enough, the integer value for which Erlang
starts to misbehave is 134217728, which is 2^27. Trying it with
value "134217727" or lower does not create this idempotence problem.
Trying R11B-5 under 32-bit Windows Vista does not exhibit such a
problem either, so it must be an issue specific to R12.

Please advise.

P.S. Thanks to Denis Titoruk and Vladimir Serov for investigating
this issue and coming up with a short test case.

--
Lev Walkin
vlm@REDACTED
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nfail.erl
URL:

From erlang-questions_efine@REDACTED Tue Sep 2 08:07:00 2008
From: erlang-questions_efine@REDACTED (Edwin Fine)
Date: Tue, 2 Sep 2008 02:07:00 -0400
Subject: [erlang-bugs] R12B-3: string:to_integer() sporadic failures
In-Reply-To: <48BCB47B.3040503@lionet.info>
References: <48BCB47B.3040503@lionet.info>
Message-ID: <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com>

You did a great piece of detective work. This really piqued my
interest and I did a bit of playing around.

It seems that the failures always occur at the same point in the
sequence. What I did is add a sequence number to each conversion, then
print them out. I conjectured that the failed values are actually all
integers but internally incorrectly represented so that the inequality
test fails. I also looked at the C code for the BIF for
string:to_integer, which was ... interesting.

The bottom line is that every time I ran the test code, the conversions
always failed at the same points (ran on Windows XP SP3, Erlang R12B-3).
I tried varying the erl parameters (turning off SMP, fiddling with
memory allocation, etc.) but to no avail.

46> nfail2:test(1000000).
6 bad conversions: [{308,{134217728,",\n"}},
                    {500,{134217728,",\n"}},
                    {1313,{134217728,",\n"}},
                    {3589,{134217728,",\n"}},
                    {5631,{134217728,",\n"}},
                    {61947,{134217728,",\n"}}]

The only thing I could think of that would vary as the code ran is the
memory allocation. From looking at the C code, I could see that bignums
need heap allocation (a macro called HAlloc).
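On the bignum point, a side note on why 2^27 in particular may matter. The Erlang Efficiency Guide documents small integers as 28-bit signed values on 32-bit emulators and 60-bit signed values on 64-bit ones, so 134217728 would be the first positive value that needs a heap-allocated bignum on a 32-bit build. The sketch below only encodes those documented limits; the module and function names are invented for illustration, and it says nothing about the actual BIF code.

```erlang
-module(smallint).
-export([max_small/1, needs_bignum/2]).

%% Hypothetical helper (not from the thread). WordSize is in bytes,
%% as returned by erlang:system_info(wordsize): 4 or 8.
%% Limits assumed from the Efficiency Guide:
%%   32-bit: 28-bit signed small integers
%%   64-bit: 60-bit signed small integers
max_small(4) -> (1 bsl 27) - 1;
max_small(8) -> (1 bsl 59) - 1.

%% True if Int falls outside the small-integer range for WordSize,
%% i.e. it would be stored as a bignum on the process heap.
needs_bignum(Int, WordSize) ->
    Max = max_small(WordSize),
    Int > Max orelse Int < -(Max + 1).
```

Under these assumptions, smallint:needs_bignum(134217728, 4) is true while smallint:needs_bignum(134217727, 4) and smallint:needs_bignum(134217728, 8) are false, which would be consistent with the 32-bit-only failures reported above.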
If there was something in the to_integer C code that messed up the
process heap (by growing the heap by too small an amount) so that every
now and then, data was written a tiny bit past the end of the heap
without crashing Erlang, this could account for the behavior. The
reason the numbers are always in the same places is that there are heap
reallocations as the heap grows, and if the bug is consistent, it will
always screw up in the same places.

To test this theory, I started Erlang with the +h flag that sets the
heap for each process to an initial size, to see if delaying heap
allocation fixed the problem.

It did fix the problem, sort of; well, it postponed it, as you will see.

G:\misc>erl +h 10000
Eshell V5.6 (abort with ^G)
1> nfail2:test(10000).
0 bad conversions: []
0
2> nfail2:test(100000).
2 bad conversions: [{5463,{134217728,",\n"}},{8964,{134217728,",\n"}}]
2

Notice that the places in which the error occurs are now different.

Hope this helps.

2008/9/1 Lev Walkin

> [...]
>
> -module(nfail).
> -export([test/1]).
>
> iterate(0, Acc) -> Acc;
> iterate(N, Acc) ->
>     iterate(N - 1,
>             [case string:to_integer("134217728,\n") of
>                  {Int, _} -> Int
>              end | Acc]).
>
> test(N) ->
>     {Self, Ref} = {self(), make_ref()},
>     spawn_opt(fun()->
>             L = iterate(N, []),
>             BadList = [X || X <- L, X =/= 134217728],
>             BadLen = length(BadList),
>             io:format("~b bad conversions: ~p~n", [BadLen, BadList]),
>             Self ! {done, Ref, BadLen}
>         end,
>         [link,{fullsweep_after,0}]),
>     receive {done, Ref, Len} -> Len end.

--
For every expert there is an equal and opposite expert - Arthur C. Clarke

From vlm@REDACTED Tue Sep 2 09:21:13 2008
From: vlm@REDACTED (Lev Walkin)
Date: Tue, 02 Sep 2008 00:21:13 -0700
Subject: [erlang-bugs] R12B-3: string:to_integer() sporadic failures
In-Reply-To: <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com>
References: <48BCB47B.3040503@lionet.info> <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com>
Message-ID: <48BCE969.60304@lionet.info>

Denis just gave me an additional piece of information. It seems
that this heap rehashing happens during a context switch.
For instance, when adding debugging code like this

    io:format("ProcInfo=[~p]~n",[process_info(self(),reductions)]),

it shows that the number of reductions before and after the faulty
string:to_integer/1 is, respectively, below and above 1000 reductions.

This points to an intervening context switch, since, according to
the documentation, a forced context switch happens every 1000 reductions.

    In the Beam emulator, the reduction counter is normally
    incremented by one for each function and BIF call, and a context
    switch is forced when the counter reaches 1000.
    Source: erl -man erlang

We have not been able to see whether the subsequent fault happens
at the next (N mod 1000) reductions mark though, but it is likely.

Edwin Fine wrote:
> You did a great piece of detective work. This really piqued my
> interest and I did a bit of playing around.
>
> It seems that the failures always occur at the same point in the
> sequence. What I did is add a sequence number to each conversion, then
> print them out. I conjectured that the failed values are actually all
> integers but internally incorrectly represented so that the inequality
> test fails.

From dgud@REDACTED Tue Sep 2 09:32:02 2008
From: dgud@REDACTED (Dan Gudmundsson)
Date: Tue, 02 Sep 2008 09:32:02 +0200
Subject: [erlang-bugs] R12B-3: string:to_integer() sporadic failures
In-Reply-To: <48BCE969.60304@lionet.info>
References: <48BCB47B.3040503@lionet.info> <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com> <48BCE969.60304@lionet.info>
Message-ID: <48BCEBF2.5040700@erix.ericsson.se>

We have found the error and a fix will be included in the next release.

/Dan

Lev Walkin wrote:
> Denis just gave me an additional piece of information. It seems
> that this heap rehashing happens during a context switch.
> For instance, when adding debugging code like this
>
>     io:format("ProcInfo=[~p]~n",[process_info(self(),reductions)]),
>
> it shows that the number of reductions before and after the faulty
> string:to_integer/1 is, respectively, below and above 1000 reductions.
>
> This points to an intervening context switch, since, according to
> the documentation, a forced context switch happens every 1000 reductions.
>
>     In the Beam emulator, the reduction counter is normally
>     incremented by one for each function and BIF call, and a context
>     switch is forced when the counter reaches 1000.
>     Source: erl -man erlang
>
> We have not been able to see whether the subsequent fault happens
> at the next (N mod 1000) reductions mark though, but it is likely.
>
>
> Edwin Fine wrote:
>> You did a great piece of detective work. This really piqued my
>> interest and I did a bit of playing around.
>>
>> It seems that the failures always occur at the same point in the
>> sequence. What I did is add a sequence number to each conversion, then
>> print them out.
I conjectured that the failed values are actually all >> integers but internally incorrectly represented so that the inequality >> test fails. I also looked at the C code for the BIF for >> string:to_integer, which was ... interesting. >> >> The bottom line is that every time I ran the test code, the conversions >> always failed at the same points (ran on Windows XP SP3, Erlang R12B-3). >> I tried varying the erl parameters (turning off SMP, fiddling with >> memory allocation, etc) but to no avail. >> >> 46> nfail2:test(1000000). >> 6 bad conversions: [{308,{134217728,",\n"}}, >> {500,{134217728,",\n"}}, >> {1313,{134217728,",\n"}}, >> {3589,{134217728,",\n"}}, >> {5631,{134217728,",\n"}}, >> {61947,{134217728,",\n"}}] >> >> The only thing I could think of that would vary as the code ran is the >> memory allocation. From lookoing at the C code, I could see that bignums >> need heap allocation (a macro called HAlloc). If there was something in >> the to_integer C code that messed up the process heap (by growing the >> heap by too small an amount) so that every now and then, data was >> written a tiny bit past the end of the heap without crashing Erlang, >> this could account for the behavior. The reason the numbers are always >> in the same places is because there are heap reallocations as the heap >> grows, and if the bug is consistent, it will always screw up in the same >> places. >> >> To test this theory, I started Erlang with the +h flag that sets the >> heap for each process to an initial size, to see if delaying heap >> allocation fixed the problem. >> >> It did fix the problem, sort of, well, it postponed it, as you will see. >> >> G:\misc>erl +h 10000 >> Eshell V5.6 (abort with ^G) >> 1> nfail2:test(10000). >> 0 bad conversions: [] >> 0 >> 2> nfail2:test(100000). >> 2 bad conversions: [{5463,{134217728,",\n"}},{8964,{134217728,",\n"}}] >> 2 >> >> Notice that the places in which the error occurs are now different. >> >> Hope this helps. 
>> >> >> 2008/9/1 Lev Walkin > >> >> >> Hi, >> >> >> we all love string to integer conversion routines, such as >> string:to_integer/1 or erlang:list_to_integer/1. >> >> The functions serve us well and indeed provide us with advertised >> functionality almost every time. >> >> However, we've noticed some oddity on our production system which, >> after two weeks of jaw-dropping musings, has boiled down to a strange >> idempotence violation in the string:to_integer/1 implementation. >> >> To those of you impatient enough, please look into the code >> attached and try running it as nfail:test(10000) and go down from >> there. >> >> Here's that string:to_integer/1 function, having the following >> signature: >> >> string:to_integer(String) -> {Int,Rest} | {error,Reason} >> >> Here's a typical invocation resulting in a number and the >> rest of the unparsed string returned: >> >> 36> string:to_integer("134217728,\n"). >> {134217728,",\n"} >> 37> >> >> What would happen if we did it once more? >> >> 37> string:to_integer("134217728,\n"). >> {134217728,",\n"} >> 38> >> >> By this time we can be reasonably sure that string:to_integer/1 will >> return the same output given the same input. We have seen that this >> is indeed the case by testing it twice. Could it be that testing >> it N times would result in a bad behavior? Nah, unlikely, you say. >> >> If you haven't looked at the attached code, it is time to do so. >> In the code, we create a list of N results of the string:to_integer/1 >> application, like this: >> >> iterate(0, Acc) -> Acc; >> iterate(N, Acc) -> >> iterate(N - 1, >> [case string:to_integer("134217728,\n") of >> {Int, _} -> Int >> end | Acc]). >> >> This code utilizes tail recursion with an accumulator list >> which gets prepended N times by string:to_integer/1 output, >> undoubtedly an integer. (Implementation with a map over lists:seq() >> output can do as well). 
>> >> So we spawn and run this iterate/2 function, appropriately checking >> that the list consists only of integers with value 134217728: >> >> test(N) -> >> {Self, Ref} = {self(), make_ref()}, >> spawn_opt(fun()-> >> L = iterate(N, []), >> % Filter out non-conforming entries >> BadList = [X || X <- L, X =/= 134217728], >> BadLen = length(BadList), >> % Here, BadLen should always be 0! >> io:format("~b bad conversions: ~p~n", [BadLen, BadList]), >> Self ! {done, Ref, BadLen} >> end, >> [link,{fullsweep_after,0}]), >> receive {done, Ref, Len} -> Len end. >> >> Based on the smoke tests above, this code should always result >> in something like this: >> >> 38> nfail:test(100). >> 0 bad conversions: [] >> 0 >> 39> >> >> But let's start testing it thoroughly, as if not believing ourselves >> that such a simple function could possibly fail at times: >> >> [vlm@REDACTED:~]> erl >> Erlang (BEAM) emulator version 5.6.3 [source] >> [async-threads:0] [kernel-poll:false] >> >> Eshell V5.6.3 (abort with ^G) >> 1> c(nfail). >> {ok,nfail} >> 2> nfail:test(1). >> 0 bad conversions: [] >> 0 >> 3> nfail:test(2). >> 0 bad conversions: [] >> 0 >> 4> nfail:test(100). >> 0 bad conversions: [] >> 0 >> 5> nfail:test(1000). >> 2 bad conversions: [{134217728,",\n"},{134217728,",\n"}] >> 2 >> 6> nfail:test(10000). >> 5 bad conversions: [{134217728,",\n"}, >> {134217728,",\n"}, >> {134217728,",\n"}, >> {134217728,",\n"}, >> {134217728,",\n"}] >> 5 >> 7> >> ... >> 32> nfail:test(810). >> 0 bad conversions: [] >> 0 >> 33> nfail:test(811). >> 2 bad conversions: [{134217728,",\n"},{134217728,",\n"}] >> 2 >> 34> nfail:test(810). >> 0 bad conversions: [] >> 0 >> 35> nfail:test(811). >> 2 bad conversions: [{134217728,",\n"},{134217728,",\n"}] >> 2 >> 36> >> >> >> As you see, string:to_integer/1 consistently generates bad entries >> which contain {134217728,",\n"} instead of 134217728, but only >> when the number of invocations reaches about 1000 (811 in this >> particular sequence of steps).
>> >> Perhaps this is a platform glitch? The above code was being executed >> on a ppc (G4) Mac OS X 10.5 with R12B-3 (32-bit) built from scratch. >> Here's what Sun sparc v9 with Solaris 10 thinks about that test case: >> >> [vlm@REDACTED:~]> erl >> Erlang (BEAM) emulator version 5.6.3 [source] >> [async-threads:0] [hipe] [kernel-poll:false] >> >> Eshell V5.6.3 (abort with ^G) >> 1> c(nfail). >> {ok,nfail} >> 2> nfail:test(100). >> 1 bad conversions: [{134217728,",\n"}] >> 1 >> 3> nfail:test(1000). >> 3 bad conversions: >> [{134217728,",\n"},{134217728,",\n"},{134217728,",\n"}] >> 3 >> 4> nfail:test(10). >> 0 bad conversions: [] >> 0 >> 5> nfail:test(50). >> 0 bad conversions: [] >> 0 >> 6> c(nfail, [native]). >> {ok,nfail} >> 7> nfail:test(500). >> 4 bad conversions: [{134217728,",\n"}, >> {134217728,",\n"}, >> {134217728,",\n"}, >> {134217728,",\n"}] >> 4 >> 8> >> >> And here's what a 64-bit Intel box (amd64, FreeBSD 6.3) thinks: >> >> [vlm@REDACTED ~]$ erl >> Erlang (BEAM) emulator version 5.6.3 [source] [64-bit] >> [async-threads:0] [hipe] [kernel-poll:false] >> >> Eshell V5.6.3 (abort with ^G) >> 1> c(nfail). >> {ok,nfail} >> 2> nfail:test(100). >> 0 bad conversions: [] >> 0 >> 3> nfail:test(1000). >> 0 bad conversions: [] >> 0 >> 4> nfail:test(10000). >> 0 bad conversions: [] >> 0 >> 5> nfail:test(100000). >> 0 bad conversions: [] >> 0 >> 6> nfail:test(1000000). >> 0 bad conversions: [] >> 0 >> 7> >> >> Oops, it went very well, surprisingly. Perhaps it is a purely >> non-Intel chip problem? Let's try it on a non-64-bit machine, >> such as a Pentium D under Microsoft Windows Vista in 32-bit mode: >> >> Erlang (BEAM) emulator version 5.6.3 [smp:2] [async-threads:0] >> >> Eshell V5.6.3 (abort with ^G) >> 1> c('c:/tmp/nfail.erl'). >> {ok,nfail} >> 2> nfail:test(100). >> 0 bad conversions: [] >> 0 >> 3> nfail:test(1000). >> 2 bad conversions: [{134217728,",\n"},{134217728,",\n"}] >> 2 >> 4> >> >> Aha!
See the pattern: 32-bit Erlang installations on many hardware >> platforms are having very similar problems. They are unable to >> consistently convert a string containing an integer into an integer >> value. Incidentally enough, the integer value for which Erlang >> starts to misbehave is 134217728, which is 2^27. Trying it with >> value "134217727" or lower does not create this idempotence problem. >> Trying R11B-5 under Windows Vista 32-bit does not exhibit such a >> problem either, so it must be an issue specific to R12. >> >> Please advise. >> >> >> P.S. Thanks to Denis Titoruk and Vladimir Serov for investigating >> this issue and coming up with a short test case. >> >> -- >> Lev Walkin >> vlm@REDACTED >> >> -module(nfail). >> -export([test/1]). >> >> iterate(0, Acc) -> Acc; >> iterate(N, Acc) -> >> iterate(N - 1, >> [case string:to_integer("134217728,\n") of >> {Int, _} -> Int >> end | Acc]). >> >> test(N) -> >> {Self, Ref} = {self(), make_ref()}, >> spawn_opt(fun()-> >> L = iterate(N, []), >> BadList = [X || X <- L, X =/= 134217728], >> BadLen = length(BadList), >> io:format("~b bad conversions: ~p~n", [BadLen, BadList]), >> Self ! {done, Ref, BadLen} >> end, >> [link,{fullsweep_after,0}]), >> receive {done, Ref, Len} -> Len end. >> >> _______________________________________________ >> erlang-bugs mailing list >> erlang-bugs@REDACTED >> http://www.erlang.org/mailman/listinfo/erlang-bugs >> >> >> >> >> -- >> For every expert there is an equal and opposite expert - Arthur C.
Clarke >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> erlang-bugs mailing list >> erlang-bugs@REDACTED >> http://www.erlang.org/mailman/listinfo/erlang-bugs > > _______________________________________________ > erlang-bugs mailing list > erlang-bugs@REDACTED > http://www.erlang.org/mailman/listinfo/erlang-bugs > From erlang-questions_efine@REDACTED Tue Sep 2 10:23:59 2008 From: erlang-questions_efine@REDACTED (Edwin Fine) Date: Tue, 2 Sep 2008 04:23:59 -0400 Subject: [erlang-bugs] R12B-3: string:to_integer() sporadic failures In-Reply-To: <6c2563b20809020118n5af97910y491abf6d852d4c0e@mail.gmail.com> References: <48BCB47B.3040503@lionet.info> <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com> <48BCE969.60304@lionet.info> <48BCEBF2.5040700@erix.ericsson.se> <6c2563b20809020118n5af97910y491abf6d852d4c0e@mail.gmail.com> Message-ID: <6c2563b20809020123n734ca12aq7adf9861b72ed640@mail.gmail.com> Thanks, Dan. As a matter of interest, what was the error (executive summary:) ? Regards, Edwin -------------- next part -------------- An HTML attachment was scrubbed... URL: From raimo+erlang-bugs@REDACTED Tue Sep 2 11:40:55 2008 From: raimo+erlang-bugs@REDACTED (Raimo Niskanen) Date: Tue, 2 Sep 2008 11:40:55 +0200 Subject: [erlang-bugs] : R12B-3: string:to_integer() sporadic failures In-Reply-To: <6c2563b20809020123n734ca12aq7adf9861b72ed640@mail.gmail.com> References: <48BCB47B.3040503@lionet.info> <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com> <48BCE969.60304@lionet.info> <48BCEBF2.5040700@erix.ericsson.se> <6c2563b20809020118n5af97910y491abf6d852d4c0e@mail.gmail.com> <6c2563b20809020123n734ca12aq7adf9861b72ed640@mail.gmail.com> Message-ID: <20080902094055.GA5845@erix.ericsson.se> On Tue, Sep 02, 2008 at 04:23:59AM -0400, Edwin Fine wrote: > Thanks, Dan. As a matter of interest, what was the error (executive > summary:) ? 
When the result became a bignum, a second allocation of heap space was made - the first for the result container and the second for the bignum; and these were in the wrong order temporal vs pointerwise violating an essential garbage collector assumption. So, when the second allocation happened to allocate a temporary heap buffer aka "heap fragment", a subsequent garbage collect messed up ending with the result pointing to invalid data in freed memory... This could have led to almost anything including emulator crash. > Regards, > Edwin > _______________________________________________ > erlang-bugs mailing list > erlang-bugs@REDACTED > http://www.erlang.org/mailman/listinfo/erlang-bugs -- / Raimo Niskanen, Erlang/OTP, Ericsson AB From erlang-questions_efine@REDACTED Tue Sep 2 11:49:09 2008 From: erlang-questions_efine@REDACTED (Edwin Fine) Date: Tue, 2 Sep 2008 05:49:09 -0400 Subject: [erlang-bugs] : R12B-3: string:to_integer() sporadic failures In-Reply-To: <20080902094055.GA5845@erix.ericsson.se> References: <48BCB47B.3040503@lionet.info> <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com> <48BCE969.60304@lionet.info> <48BCEBF2.5040700@erix.ericsson.se> <6c2563b20809020118n5af97910y491abf6d852d4c0e@mail.gmail.com> <6c2563b20809020123n734ca12aq7adf9861b72ed640@mail.gmail.com> <20080902094055.GA5845@erix.ericsson.se> Message-ID: <6c2563b20809020249p1a1810e8rc667927234606ea9@mail.gmail.com> Wow. I'm really glad Lev and co. noticed it, then! Thanks. On Tue, Sep 2, 2008 at 5:40 AM, Raimo Niskanen < raimo+erlang-bugs@REDACTED >wrote: > On Tue, Sep 02, 2008 at 04:23:59AM -0400, Edwin Fine wrote: > > Thanks, Dan. As a matter of interest, what was the error (executive > > summary:) ? > > When the result became a bignum, a second allocation of heap > space was made - the first for the result container and the > second for the bignum; and these were in the wrong order > temporal vs pointerwise violating an essential garbage > collector assumption.
> > So, when the second allocation happened to allocate > a temporary heap buffer aka "heap fragment", a > subsequent garbage collect messed up ending with > the result pointing to invalid data in freed memory... > > This could have led to almost anything including > emulator crash. > > > Regards, > > Edwin > > > _______________________________________________ > > erlang-bugs mailing list > > erlang-bugs@REDACTED > > http://www.erlang.org/mailman/listinfo/erlang-bugs > > -- > > / Raimo Niskanen, Erlang/OTP, Ericsson AB > > From bgreence@REDACTED Tue Sep 2 19:01:17 2008 From: bgreence@REDACTED (bruce green) Date: Tue, 2 Sep 2008 23:01:17 +0600 Subject: [erlang-bugs] asn.1 - decoding corrupted binary Message-ID: There is an infinite loop when decoding a corrupted binary. Example: Rec3 DEFINITIONS IMPLICIT TAGS ::= BEGIN EXPORTS Rec3; Rec3 ::= SET { recType [0] RecType, typedItem [1] TypedItem OPTIONAL } RecType ::= INTEGER { rec4 (0), rec5 (1) } TypedItem ::= OCTET STRING (SIZE(1..20)) END the record: #'Rec3'{recType=rec5,typedItem=[16#12,16#34,16#56,16#78,16#40,16#90,16#19,16#33]} the encoded binary: 31 0D 80 01 01 81 08 12 34 56 78 40 90 19 33 the modified (=corrupted) binary: 31 0D 80 01 01 00 00 00 00 00 00 00 00 00 00 Now I try to decode the corrupted binary and the program goes into an infinite loop. The suspected code in the generated erl module: 'dec_Rec3_fun'(Bytes, OptOrMand) -> ...
%% tag not found, if extensionmark we should skip bytes here _ -> {[], Bytes,0} Tested on: R11B3 (asn1 - 1.4.4.11), R12B3 (asn1 - 1.5.2) From norton@REDACTED Wed Sep 3 04:21:34 2008 From: norton@REDACTED (Joseph Wayne Norton) Date: Wed, 03 Sep 2008 11:21:34 +0900 Subject: [erlang-bugs] R12B-3: string:to_integer() sporadic failures In-Reply-To: <48BCE969.60304@lionet.info> References: <48BCB47B.3040503@lionet.info> <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com> <48BCE969.60304@lionet.info> Message-ID: Hello. Does anyone know if this bug exists in R11B-X, or just in R12B-X? thanks, On Tue, 02 Sep 2008 16:21:13 +0900, Lev Walkin wrote: > > Denis just gave me an additional piece of information. It seems > that this heap rehashing happens during a context switch. > For instance, when adding debugging code like this > > io:format("ProcInfo=[~p]~n",[process_info(self(),reductions)]), > > it shows that the number of reductions before and after the faulty > string:to_integer/1 is, respectively, below and above 1000 reductions. > > This points to an intervening context switch, since, according to > documentation, a forced context switch happens every 1000 reductions. > > In the Beam emulator, the reduction counter is normally > incremented by one for each function and BIF call, and a context > switch is forced when the counter reaches 1000. > Source: erl -man erlang > > We have not been able to see whether the subsequent fault happens > at the next (N mod 1000) reductions mark though, but it is likely. > > > Edwin Fine wrote: >> You did a great piece of detective work. This really piqued my >> interest and I did a bit of playing around. >> >> It seems that the failures always occur at the same point in the >> sequence. What I did is add a sequence number to each conversion, then >> print them out. I conjectured that the failed values are actually all >> integers but internally incorrectly represented so that the inequality >> test fails.
-- norton@REDACTED From bertil.karlsson@REDACTED Wed Sep 3 09:00:20 2008 From: bertil.karlsson@REDACTED (Bertil Karlsson) Date: Wed, 03 Sep 2008 09:00:20 +0200 Subject: [erlang-bugs] asn.1 - decoding corrupted binary In-Reply-To: References: Message-ID: <48BE3604.5070802@ericsson.com> Thank you for reporting this behaviour of the asn1 decoder. It is a bug that will be corrected, though we cannot make it to the coming OTP-release. /Bertil Karlsson OTP-Team bruce green wrote: > There is an infinite loop during corrupted binary decoding. > > Example: > > Rec3 DEFINITIONS IMPLICIT TAGS ::= > BEGIN > EXPORTS Rec3; > Rec3 ::= SET > { > recType [0] RecType, > typedItem [1] TypedItem OPTIONAL > } > RecType ::= INTEGER > { > rec4 (0), > rec5 (1) > } > TypedItem ::= OCTET STRING (SIZE(1..20)) > END > > the record: > #'Rec3'{recType=rec5,typedItem=[16#12,16#34,16#56,16#78,16#40,16#90,16#19,16#33]} > > the encoded binary: > 31 0D 80 01 01 81 08 12 34 56 78 40 90 19 33 > > the modified (=corrupted) binary: > 31 0D 80 01 01 00 00 00 00 00 00 00 00 00 00 > > Now I try to decode the corrupted binary and the program goes to the > infinite loop. > > The suspected code in the generated erl module: > 'dec_Rec3_fun'(Bytes, OptOrMand) -> > ...
> %% tag not found, if extensionmark we should skip bytes here > _ -> {[], Bytes,0} > > Tested on: R11B3 (asn1 - 1.4.4.11), R12B3 (asn1 - 1.5.2) > _______________________________________________ > erlang-bugs mailing list > erlang-bugs@REDACTED > http://www.erlang.org/mailman/listinfo/erlang-bugs > > From bgustavsson@REDACTED Wed Sep 3 09:00:20 2008 From: bgustavsson@REDACTED (Bjorn Gustavsson) Date: Wed, 3 Sep 2008 09:00:20 +0200 Subject: [erlang-bugs] R12B-3: string:to_integer() sporadic failures In-Reply-To: References: <48BCB47B.3040503@lionet.info> <6c2563b20809012307s3d366e5h2bcb2b65998a7fbd@mail.gmail.com> <48BCE969.60304@lionet.info> Message-ID: <6672d0160809030000w4231ebf1la638346024fdc910@mail.gmail.com> On Wed, Sep 3, 2008 at 4:21 AM, Joseph Wayne Norton wrote: > Hello. > > Does anyone know if this bug exists in R11B-X? or just R12B-X? > Only in R12B-X. /Bjorn > > -- Björn Gustavsson, Erlang/OTP, Ericsson AB From kenji.rikitake@REDACTED Wed Sep 3 11:28:01 2008 From: kenji.rikitake@REDACTED (Kenji Rikitake) Date: Wed, 3 Sep 2008 18:28:01 +0900 Subject: [erlang-bugs] Setting up Erlang R12B3 ssl-3.9 inet_ssl distribution (Re: [erlang-questions] Erlang R12B3 inet_ssl_dist does not work with ssl-3.9) In-Reply-To: <20080824040300.GA78691@k2r.org> References: <20080823081636.GA63811@k2r.org> <20080824040300.GA78691@k2r.org> Message-ID: <20080903092801.GA82392@k2r.org> In the message <20080824040300.GA78691@REDACTED> dated Sun, Aug 24, 2008 at 01:02:37PM +0900, Kenji Rikitake writes: > I have been trying many times to start Erlang SSL distribution on R12B3 > with ssl-3.9, which hasn't been successful. > I'm running Erlang VM on FreeBSD 6.3-RELEASE.
I finally found out that setting the client/server key pairs of inet_ssl_dist solved the problem, which was written at: http://www.trapexit.org/forum/viewtopic.php?p=22404#22404 I had to build client/server self-signed keys as written in: http://sial.org/howto/openssl/self-signed/ So the real problems were: * ssl-3.9 manual Chapter 5 does not represent the R12B3 implementation difference. In R12B3: * In creating start_ssl.boot as described in ssl-3.9 manual section 5.2, two warnings remain: 1> systools:make_script("start_ssl",[]). *WARNING* ssl: Source code not found: ssl_pkix_oid.erl *WARNING* ssl: Source code not found: 'OTP-PKIX'.erl ok To suppress the warning messages, creating symbolic links worked: (cd $ERLANG_TOP/lib/ssl-3.9/src; ln -s ../pkix/OTP-PKIX.erl; ln -s ../pkix/ssl_pkix_oid.erl;) * Even after the boot script is built, ssl_server is not registered, which is supposed to be, as described in ssl manual Section 5.2. This is due to ssl-3.9 implementation; to invoke ssl_server, do ssl:version() so that the version number tuple like following returns: {ok,{"3.9","OpenSSL 0.9.7e-p1 25 Oct 2004", "OpenSSL 0.9.7e-p1 25 Oct 2004"}} * The starting sequence written in Section 5.5 is *mandatory* to start up the Erlang Shell with inet_ssl distribution. Specifically, the server_certfile and client_certfile options of -ssl_dist_opt are *required*; otherwise, the shell will not start. A startup script example is: erl -boot /my/dir1/start_ssl \ -proto_dist inet_ssl \ -name a1 \ -ssl_dist_opt server_certfile \ /my/certs/host.pem \ -ssl_dist_opt client_certfile \ /my/certs/host.pem \ -ssl_dist_opt verify 1 depth 1 * Summary: the Section 5 of the manual of ssl-3.9 has to be fixed up-to-date to represent the implementation difference. 
(Yes, I tcpdump'ed the packets, and the exchange between two inet_ssl hosts were actually encrypted :-)) Regards, Kenji Rikitake From anders.danne@REDACTED Thu Sep 4 16:51:05 2008 From: anders.danne@REDACTED (Anders Danne) Date: Thu, 04 Sep 2008 16:51:05 +0200 Subject: [erlang-bugs] Bugs in relaxed HTTP REDIRECT response Message-ID: <48BFF5D9.9060709@ericsson.com> Hi, I had to fix two things in R12B-4 httpc_response.erl. I sent a HTTP get with {relaxed,true} and got a HTTP REDIRECT back and got a crash. 1) Port is an integer. 2) The Path was duplicated in RedirUrl. That may depend on the server so a more intelligent solution may be needed. Original code: fix_relative_uri(Request, RedirUrl) -> {Server, Port} = Request#request.address, Path = Request#request.path, atom_to_list(Request#request.scheme) ++ "://" ++ Server ++ ":" ++ Port ++ Path ++ RedirUrl. New code: ... atom_to_list(Request#request.scheme) ++ "://" ++ Server ++ ":" ++ integer_to_list(Port) ++ "/" ++ RedirUrl. ///Anders From vladdu55@REDACTED Fri Sep 5 13:37:31 2008 From: vladdu55@REDACTED (Vlad Dumitrescu) Date: Fri, 5 Sep 2008 13:37:31 +0200 Subject: [erlang-bugs] Missing documentation Message-ID: <95be1d3b0809050437r25307776h93928c8d387e1bde@mail.gmail.com> Hi! The R12 documentation is missing documentation for some functions. The ones I noticed are code:is_sticky/1 and is_module_native/1. best regards, Vlad From sverker@REDACTED Fri Sep 5 15:17:04 2008 From: sverker@REDACTED (Sverker Eriksson) Date: Fri, 05 Sep 2008 15:17:04 +0200 Subject: [erlang-bugs] Incomplete detaching from a controlling terminal In-Reply-To: References: Message-ID: <48C13150.5050609@erix.ericsson.se> Sergei Golovan wrote: > Appears that if started with -detached option erlexec doesn't fully > detach from a controlling terminal which may end with killing a > detached service. Thanks! It will be corrected in R12B-5. 
/Sverker, Erlang/OTP Ericsson From goryachev@REDACTED Fri Sep 5 20:46:09 2008 From: goryachev@REDACTED (Igor Goryachev) Date: Fri, 05 Sep 2008 22:46:09 +0400 Subject: [erlang-bugs] bug: Protocol: "inet_tcp": register/listen error: eaddrinuse while starting a node Message-ID: <871vzyfn2m.fsf@yandex-team.ru> Hello everybody. It's me again. I posted this message some time ago, but have not got any response. I confirm that this bug is reproduced (it seems to be a floating bug) when using version R12B-3 with '-kernel inet_dist_listen_min XXXX inet_dist_listen_max YYYY' where XXXX is equal to YYYY. -------------- next part -------------- An embedded message was scrubbed... From: Igor Goryachev Subject: [erlang-bugs] bug: Protocol: "inet_tcp": register/listen error: eaddrinuse while starting a node Date: Tue, 05 Feb 2008 18:00:30 +0300 Size: 6509 URL: -------------- next part -------------- -- Igor Goryachev Yandex development team. From pguyot@REDACTED Sun Sep 7 10:57:40 2008 From: pguyot@REDACTED (Paul Guyot) Date: Sun, 7 Sep 2008 10:57:40 +0200 Subject: [erlang-bugs] Dynamic libraries are not closed on MacOS X [with patch] Message-ID: Hello, On MacOS X, dynamic libraries opened with erl_ddll:load or load_driver are not closed when erl_ddll:unload/unload_driver is called. Steps to reproduce: * build a simple dynamic library, called simple_drv.so * start a new erlang shell in the same directory. * note down the PID. * evaluate erl_ddll:load(".", "simple_drv"). * run lsof -p <PID> to see that the simple_drv.so file is indeed open. * evaluate erl_ddll:unload("simple_drv"). * run lsof -p <PID> to see that the simple_drv.so file is still open. This is specific to MacOS X and is simply because the code hasn't been written. On other Unix implementations, the erl_ddll code calls dlopen and dlclose. The attached patch against R12B-4 fixes the bug and was tested on MacOS X 10.4/ppc. Regards, Paul -------------- next part -------------- A non-text attachment was scrubbed...
Name: patch-erts_emulator_sys_unix_ddll.c
Type: application/octet-stream
Size: 807 bytes
Desc: not available
URL:
-------------- next part --------------

From erlang-questions_efine@REDACTED Sun Sep 7 21:23:21 2008
From: erlang-questions_efine@REDACTED (Edwin Fine)
Date: Sun, 7 Sep 2008 15:23:21 -0400
Subject: [erlang-bugs] Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
Message-ID: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com>

Hi OTP Team,

I realize you have been very busy with the R12B-4 release, and this is not a
complaint or criticism, just a request for info.

I reported this bug some weeks ago and have not received an acknowledgment.
I simply want to know if you accepted it, rejected it, or fixed it already
(and if so, in which release the fix appears). I have had to code around
this and would like to know if I can remove that code.

Link to original bug report:
http://www.erlang.org/pipermail/erlang-bugs/2008-August/000931.html

Best regards,
Edwin Fine

From raimo+erlang-bugs@REDACTED Mon Sep 8 12:00:37 2008
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Mon, 8 Sep 2008 12:00:37 +0200
Subject: [erlang-bugs] Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
In-Reply-To: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com>
References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com>
Message-ID: <20080908100037.GA1280@erix.ericsson.se>

On Sun, Sep 07, 2008 at 03:23:21PM -0400, Edwin Fine wrote:
> Hi OTP Team,
>
> I realize you have been very busy with the R12B-4 release, and this is not a
> complaint or criticism, just a request for info.

Perhaps it should be...

>
> I reported this bug some weeks ago and have not received an acknowledgment.
> I simply want to know if you accepted it, rejected it, or fixed it already

You are right, we have been busy with the release.

Your problem (as we say in Swedish) fell between the chairs.
If it is an inet_drv bug it is one guy's problem, an SMP bug
another guy's problem. But enough excuses...
we will look into it now. It sounds serious.

> (and if so, in which release the fix appears). I have had to code around
> this and would like to know if I can remove that code.
>
> Link to original bug report:
> http://www.erlang.org/pipermail/erlang-bugs/2008-August/000931.html
>
> Best regards,
> Edwin Fine

> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-bugs

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From erlang-questions_efine@REDACTED Mon Sep 8 14:43:22 2008
From: erlang-questions_efine@REDACTED (Edwin Fine)
Date: Mon, 8 Sep 2008 08:43:22 -0400
Subject: [erlang-bugs] Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
In-Reply-To: <20080908100037.GA1280@erix.ericsson.se>
References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com> <20080908100037.GA1280@erix.ericsson.se>
Message-ID: <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com>

Raimo,

Thanks for the response. Good luck finding the bug. I just confirmed that it
is still present on R12B-4. Please note that you need to connect to a port
that is open but with no program using it (e.g. one could try port 80
without httpd running). Sorry to state the obvious, it's a bad habit of
mine.

Regards,
Edwin Fine

From raimo+erlang-bugs@REDACTED Tue Sep 9 09:12:43 2008
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Tue, 9 Sep 2008 09:12:43 +0200
Subject: [erlang-bugs] : Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
In-Reply-To: <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com>
References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com> <20080908100037.GA1280@erix.ericsson.se> <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com>
Message-ID: <20080909071243.GA30749@erix.ericsson.se>

On Mon, Sep 08, 2008 at 08:43:22AM -0400, Edwin Fine wrote:
> Raimo,
>
> Thanks for the response. Good luck finding the bug. I just confirmed that it
> is still present on R12B-4. Please note that you need to connect to a port
> that is open but with no program using it (e.g. one could try port 80
> without httpd running). Sorry to state the obvious, it's a bad habit of
> mine.

Nono, please state the obvious. People often leave out what they
think is obvious, and it turns out to be non-obvious.
But on the other hand, it may be confusing too. Do you mean
that it must be a port that is open in the firewall
but has no listening socket, so that you get the RST response
from the TCP stack on the target machine (which is supposed
to be Windows XP)? Have you tried other targets? Since you
report having seen SYN in, RST out on the target for all
connection attempts, it should not matter.

>
> Regards,
> Edwin Fine

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From erlang-questions_efine@REDACTED Tue Sep 9 12:47:25 2008
From: erlang-questions_efine@REDACTED (Edwin Fine)
Date: Tue, 9 Sep 2008 06:47:25 -0400
Subject: [erlang-bugs] : Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
In-Reply-To: <20080909071243.GA30749@erix.ericsson.se>
References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com> <20080908100037.GA1280@erix.ericsson.se> <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com> <20080909071243.GA30749@erix.ericsson.se>
Message-ID: <6c2563b20809090347s6ee29ebfq7f2c39e476627807@mail.gmail.com>

Raimo,

Yes, it must be a port that is open in the firewall but has no listening
socket.
I have not tried it on other targets (I only have Windows and Linux, and I
tried connecting from Linux to Windows XP).

Hope this helps.

From raimo+erlang-bugs@REDACTED Tue Sep 9 13:32:07 2008
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Tue, 9 Sep 2008 13:32:07 +0200
Subject: [erlang-bugs] : : Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
In-Reply-To: <6c2563b20809090347s6ee29ebfq7f2c39e476627807@mail.gmail.com>
References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com> <20080908100037.GA1280@erix.ericsson.se> <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com> <20080909071243.GA30749@erix.ericsson.se> <6c2563b20809090347s6ee29ebfq7f2c39e476627807@mail.gmail.com>
Message-ID: <20080909113207.GA2289@erix.ericsson.se>

On Tue, Sep 09, 2008 at 06:47:25AM -0400, Edwin Fine wrote:
> Raimo,
>
> Yes, it must be a port that is open in the firewall but have no listening
> socket.
> I have not tried it on other targets (I only have Windows and Linux, and I
> tried connecting from Linux to Windows XP).
>
> Hope this helps.

I can reproduce the bug on a SLES 10 SP 1 x86_64
"Erlang (BEAM) emulator version 5.6.4 [source] [64-bit] [smp:4] [async-threads:0] [hipe] [kernel-poll:false]\n"
both towards an XP machine and towards another SLES 10 machine,
but oddly enough not against the machine itself, neither over
the loopback interface nor over the external interface. This probably
suggests that badass timing is involved. I hope a debug-compiled
emulator still shows the symptom.

I'll be back...

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From mog-lists@REDACTED Thu Sep 11 12:54:51 2008
From: mog-lists@REDACTED (mog)
Date: Thu, 11 Sep 2008 05:54:51 -0500
Subject: [erlang-bugs] rsa upgrades for crypto
Message-ID: <20080911105451.GA13742@metalman.digium.internal>

Someone back at r11b0 released a patch for adding md5 and sha rsa signing and verifying,
http://www.nabble.com/Patch:-Add-MD5-and-SHA1-sign-verify-functions-td5681721.html
I have
updated it for R12B-4. I hope it will be considered, as I need
rsa_sha_signing and verifying in an app I am using.

Mog
-------------- next part --------------
A non-text attachment was scrubbed...
Name: crypto.patch
Type: text/x-diff
Size: 8700 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 197 bytes
Desc: Digital signature
URL:

From dgud@REDACTED Thu Sep 11 13:24:18 2008
From: dgud@REDACTED (Dan Gudmundsson)
Date: Thu, 11 Sep 2008 13:24:18 +0200
Subject: [erlang-bugs] rsa upgrades for crypto
In-Reply-To: <20080911105451.GA13742@metalman.digium.internal>
References: <20080911105451.GA13742@metalman.digium.internal>
Message-ID: <48C8FFE2.5040902@erix.ericsson.se>

I will add something like that; I googled the patch yesterday.

But I liked
  rsa_verify(DigestType, Dgst, Sign, Key).
  rsa_sign(DigestType, Data, Key).
better, where DigestType is md5 or sha.

/Dan

mog wrote:
> Someone back at r11b0 released a patch for adding md5 and sha rsa signing and verifying, http://www.nabble.com/Patch:-Add-MD5-and-SHA1-sign-verify-functions-td5681721.html I have updated it for r12b4, I would hope it would be considered as i need rsa_sha_signing and verifying in an app i am using
>
> Mog
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-bugs

From mog-lists@REDACTED Thu Sep 11 15:59:04 2008
From: mog-lists@REDACTED (mog)
Date: Thu, 11 Sep 2008 08:59:04 -0500
Subject: [erlang-bugs] rsa upgrades for crypto
In-Reply-To: <48C8FFE2.5040902@erix.ericsson.se>
References: <20080911105451.GA13742@metalman.digium.internal> <48C8FFE2.5040902@erix.ericsson.se>
Message-ID: <20080911135903.GA4420@metalman.digium.internal>

On Thu, Sep 11, 2008 at 01:24:18PM +0200, Dan Gudmundsson wrote:
> I will add something like that, I googled the patch yesterday..
>
> But I liked
> rsa_verify(DigestType, Dgst, Sign, Key).
> rsa_sign(DigestType, Data, Key).
> better where DigestType is md5 or sha.

That sounds great. I would love to test it when you're done with it ^_^.
Also, the key format used in this patch was different from the standard
key you read out of a file, as it didn't need all the information. For
ease of use, could you use the same key type that is used throughout
crypto?

Mog
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 197 bytes
Desc: Digital signature
URL:

From michael.regen@REDACTED Fri Sep 12 15:14:46 2008
From: michael.regen@REDACTED (Michael Regen)
Date: Fri, 12 Sep 2008 15:14:46 +0200
Subject: [erlang-bugs] EPMD protocol documentation
Message-ID: <9b59d0270809120614l495a77b3ja0f5ac4057f280f0@mail.gmail.com>

Hi,

I think there are a couple of problems with the documentation of the EPMD
protocol at http://erlang.org/doc/apps/erts/erl_dist_protocol.html#9.1 (in
R12B-3 and R12B-4) as well as erts/emulator/internal_doc/erl_ext_dist.txt
(in R12B-3):

* ALIVE2_REQ: DistrvsnRange is 4 bytes, not 2. "Four bytes where MSB (2
bytes) = Highestvsn and LSB (2 bytes) = Lowestvsn. For erts-4.6.x
(OTP-R3)the vsn = 0 For erts-4.7.x (OTP-R4) = ?????."

* PORT2_RESP: Elen is described as a 2-byte field. But at least if no
extra field was provided during ALIVE2_REQ (as I think erts usually does),
PORT2_RESP just returns Elen as one byte = 0. And erts does not seem to
work correctly if we send back a packet as specified.

* NAMES_RESP: In the documentation it looks like one packet should be sent
back containing the whole answer, whereas in reality EPMDPortNo and each
answer are sent back in different packets.

* DUMP_RESP: Same as for NAMES_RESP: different packets are expected.
  Furthermore the documentation specifies NodeInfo, as expressed in Erlang:
    io:format("active name ~s at port ~p, fd = ~p ~n", [NodeName, Port, Fd]).
  for registered nodes. Correct would be:
    io:format("active name <~s> at port ~p, fd = ~p~n", [NodeName, Port, Fd]).
  Notice the <> characters!
  For unregistered nodes:
    io:format("old/unused name ~s at port ~p, fd = ~p~n", [NodeName, Port, Fd]).
  Correct would be:
    io:format("old/unused name <~s>, port = ~p, fd = ~p ~n", [NodeName, Port, Fd]).
  Notice the <> characters as well as the last space before the new line!

Furthermore it might be good to mention that all answers are followed by a
close of the socket, except for ALIVE2_RESP.
It might also be good to mention that all integers are in big-endian format.

Thank you!

Regards,
Michael

From michael.regen@REDACTED Sat Sep 13 18:30:02 2008
From: michael.regen@REDACTED (Michael Regen)
Date: Sat, 13 Sep 2008 18:30:02 +0200
Subject: [erlang-bugs] Erlang emulator crash, gen_tcp related (probably only under Windows)
Message-ID: <9b59d0270809130930g260f8ca8wc19f4d20a5802655@mail.gmail.com>

Hi,

I run into Erlang emulator crashes when I do basic gen_tcp operations. My
code crashes with the message:

Crash dump was written to: erl_crash.dump
Inconsistent, why isnt io reported?
Abnormal termination

without any significant error message before.

The problem occurs on Windows XP. I am not sure whether Linux is affected
as well, but short tests showed no problems there.

To reproduce the problem:

1) tcp_test.erl
This is a simple gen_tcp client which spawns processes that connect to a
port, send a few bytes, try to get the answer and close the port:

-------------------------- start: tcp_test.erl --------------------------
-module(tcp_test).
-export([test/1, test_con/0]).

-define(DEF_PORT, 2222).
-define(DEF_IP, {127,0,0,1}).

%% Spawn HowManyProcs client processes.
test(0) -> ok;
test(HowManyProcs) ->
    spawn(?MODULE, test_con, []),
    test(HowManyProcs-1).

%% Connect, send a few bytes, wait up to 500 ms for an answer, close.
%% A failed connect crashes the process with badmatch.
test_con() ->
    {ok,S} = gen_tcp:connect(?DEF_IP, ?DEF_PORT,[]),
    gen_tcp:send(S,<<0,5,65,66,67,68,69>>),
    receive
        {tcp_closed, _Socket} -> ok;
        _Msg -> gen_tcp:close(S)
    after 500 ->
        gen_tcp:close(S)
    end.
-------------------------- end: tcp_test.erl --------------------------

2) tcp_server_app
Please just take the code from the trapexit tutorial 'Building a
Non-blocking TCP server using OTP principles'
http://trapexit.org/Building_a_Non-blocking_TCP_server_using_OTP_principles

There is a size limit on the erlang-bugs mailing list, which is why I do not
send the whole code + crash dump as an attachment. Please give me a note if
you want it!

To start the client:
werl.exe
tcp_test:test(1000). % 1000 is the number of processes to start. Please play with higher numbers as well!

To start the server:
werl.exe
application:start(tcp_server).

I tested with the client and the server running in different Erlang nodes.

You should be able to crash the emulator by:
a) only running the client without anything listening on the port 2222
b) running the client together with the server (client crashes)
c) running the client together with the server (server crashes)
d) running the client together with the server (client + server crash at
the same time)

It might take some attempts, and in some cases starting the client with
even 10.000 processes (tcp_test:test(10000)). But usually only very few
attempts (one) are necessary.

An interesting observation: if you run the tests under erl.exe (instead of
werl.exe) it takes significantly more processes/tries to reach the crash.
Furthermore erl.exe (or is it cmd.exe?) crashes with a:
The exception unknown software exception (0x40000015) occurred in the
application at location 0x008fff86
The location seems to be always the same.

The problem might be timing related.

Tests were done on R12B-3 and R12B-4 on Windows XP SP2 and SP3 systems.
Hardware: Athlon XP64 in 32-bit mode, 1GB Ram, Centrino Notebook 1GB Ram,
Intel E6600 dual-core in 32-bit mode, 4 GB of RAM.
So, the problem might occur on SMP and non-SMP systems.

During various tests you might see the following errors in the client node:
{{badmatch,{error,econnrefused}} % this is expected
{{badmatch,{error,eaddrinuse}}
{{badmatch,{error,system_limit}}

If you start the server with sasl enabled you might under rare
circumstances see the following error messages as well:

-------------------------- start: log server --------------------------
=ERROR REPORT==== 12-Sep-2008::12:58:56 ===
File operation error: system_limit. Function: get_cwd. Process: code_server.

=ERROR REPORT==== 12-Sep-2008::12:58:56 ===
Error in async accept: {async_accept,"file table overflow"}.

=ERROR REPORT==== 12-Sep-2008::12:58:56 ===
** Generic server tcp_listener terminating
** Last message in was {inet_async,#Port<0.109>,1019,{ok,#Port<0.2141>}}
** When Server state == {state,#Port<0.109>,1019,tcp_echo_fsm}
** Reason for termination ==
** {async_accept,"file table overflow"}
[...]
-------------------------- end: log server --------------------------

Running out of ephemeral ports (all user ports in TIME_WAIT) should not be
the problem, since it also occurs with the registry key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\MaxUserPort
set to 60000 and after only 2000 processes.

Local firewall did not affect the outcome (turned on/off).

There is a thread on erlang-questions which might in the future contain
additional information:
http://erlang.org/pipermail/erlang-questions/2008-September/038118.html

Thank you!

Regards,
Michael
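
The client in the report above matches on {ok,S} directly, so every failed
connect shows up only as a badmatch crash report. A minimal sketch of a
variant that tallies the outcomes instead, which may make it easier to tell
the expected econnrefused apart from eaddrinuse or system_limit. This is not
part of the original report: the module name tcp_test_tally and its
functions are hypothetical, while the port, address and payload are copied
from tcp_test.erl. It does not avoid the emulator crash itself.

```erlang
-module(tcp_test_tally).
-export([test/1]).

-define(DEF_PORT, 2222).
-define(DEF_IP, {127,0,0,1}).

%% Spawn N clients, wait for one result message from each of them and
%% return an orddict of {Outcome, Count} pairs.
test(N) when is_integer(N), N >= 0 ->
    Parent = self(),
    Ref = make_ref(),
    [spawn(fun() -> Parent ! {Ref, (catch test_con())} end)
     || _ <- lists:seq(1, N)],
    collect(Ref, N, orddict:new()).

%% Same traffic as test_con/0 in tcp_test.erl, but a failed connect is
%% returned as a value instead of crashing the process with badmatch.
test_con() ->
    case gen_tcp:connect(?DEF_IP, ?DEF_PORT, []) of
        {ok, S} ->
            gen_tcp:send(S, <<0,5,65,66,67,68,69>>),
            receive
                {tcp_closed, S} ->
                    closed_by_peer;
                {tcp, S, _Data} ->
                    gen_tcp:close(S),
                    got_reply;
                {tcp_error, S, Reason} ->
                    gen_tcp:close(S),
                    {tcp_error, Reason}
            after 500 ->
                gen_tcp:close(S),
                timeout
            end;
        {error, Reason} ->
            {connect_error, Reason}
    end.

%% Count identical outcomes until all N clients have reported.
collect(_Ref, 0, Acc) ->
    Acc;
collect(Ref, Left, Acc) ->
    receive
        {Ref, Outcome} ->
            collect(Ref, Left - 1, orddict:update_counter(Outcome, 1, Acc))
    end.
```

With nothing listening on port 2222, tcp_test_tally:test(1000) would be
expected to report mostly {connect_error,econnrefused}; any other keys in
the result are the interesting ones.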

From kenji.rikitake@REDACTED Sun Sep 14 03:41:11 2008
From: kenji.rikitake@REDACTED (Kenji Rikitake)
Date: Sun, 14 Sep 2008 10:41:11 +0900
Subject: [erlang-bugs] leap-second-enabled FreeBSD doesn't work right with R12B4 erts/emulator/beam/erl_time_sup.c; correction patch included
Message-ID: <20080914014111.GA7026@k2r.org>

A patch to correct erlang:universaltime_to_localtime/1
for FreeBSD running leap-second-enabled timezone
by Kenji Rikitake
14-SEP-2008

* Summary

This patch fixes the time calculation problem of FreeBSD 6.x and 7.x,
which has the internal leap-second correction enabled. This patch is
tested with the Erlang/OTP R12B-4 source distribution.

* Symptom

Without this patch, erlang:localtime_to_universaltime/1 and
erlang:universaltime_to_localtime/1 are not symmetric, which will break
calendar:local_time_to_universal_time_dst_1/1 and
httpd_util:rfc1123_date/1.

* Example of symptom (where local time is GMT + 9 hours):

1> erlang:localtime_to_universaltime({{2008,9,1},{12,0,0}}).
{{2008,9,1},{3,0,0}}
2> erlang:universaltime_to_localtime({{2008,9,1},{3,0,0}}).
{{2008,9,1},{11,59,37}}

(Note that as of September 1, 2008, TAI - UTC = 33 seconds. UNIX time_t
with TAI correction is 10 seconds ahead of UTC. So the 23-second
difference (33 - 10) occurs when the leap-second correction is NOT
performed, as in the C function univ_to_local() in
erts/emulator/beam/erl_time_sup.c.)

* Workaround in this patch

This patch changes the operation of erlang:universaltime_to_localtime/1
so that the "universaltime" is handled properly with leap-second
correction. (Note: OS time_t is in TAI)

See FreeBSD man time2posix(3) and /usr/src/lib/libc/stdtime/localtime.c
for further details.

This patch will NOT affect a FreeBSD machine without leap-second
correction; posix2time() will do nothing in such a situation.
(NOT tested though)

* Caveats, TODO and suggestions

There is no portable way to do this among UNIX-derived OSes.

HAVE_POSIX2TIME should be set by configure.

A Linux patch for leap-second supported systems is highly desired.
(I found posix2time() is also available on Linux, but I'm not sure)

Should I rather do this using timegm(3) and throw away
the rather naive computation algorithm in univ_to_local()?
(Note that this is not necessarily portable either)

* How to apply this patch

Apply this in the Erlang R12B4 source tree, in the directory:
erts/emulator/beam

--- erl_time_sup.c.FCS 2008-04-07 22:57:50.000000000 +0900
+++ erl_time_sup.c 2008-09-14 09:56:10.000000000 +0900
@@ -71,6 +71,10 @@
 **
 */
 
+/* FreeBSD internal leap second correction function */
+/* define this for FreeBSD 6.x and 7.x */
+#define HAVE_POSIX2TIME
+
 #ifdef HAVE_CONFIG_H
 #  include "config.h"
 #endif
@@ -686,6 +690,18 @@
     the_clock = *second + 60 * (*minute + 60 * (*hour + 24 *
                                             gregday(*year, *month, *day)));
 
+#ifdef HAVE_POSIX2TIME
+    /*
+     * leap-second correction performed
+     * if system is configured so;
+     * do nothing if not
+     * See FreeBSD 6.x and 7.x
+     * /usr/src/lib/libc/stdtime/localtime.c
+     * for the details
+     */
+    the_clock = posix2time(the_clock);
+#endif
+
 #ifdef HAVE_LOCALTIME_R
     localtime_r(&the_clock, (tm = &tmbuf));
 #else

From matthew.dempsky@REDACTED Sun Sep 14 05:03:53 2008
From: matthew.dempsky@REDACTED (Matthew Dempsky)
Date: Sat, 13 Sep 2008 20:03:53 -0700
Subject: [erlang-bugs] leap-second-enabled FreeBSD doesn't work right with R12B4 erts/emulator/beam/erl_time_sup.c; correction patch included
In-Reply-To: <20080914014111.GA7026@k2r.org>
References: <20080914014111.GA7026@k2r.org>
Message-ID:

On Sat, Sep 13, 2008 at 6:41 PM, Kenji Rikitake wrote:
> Should I rather do this using timegm(3) and throw away
> the rather naive computation algorithm in univ_to_local()?
> (Note that this is not necessarily portable either)

mktime(3) is defined by POSIX.
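
The asymmetry described in the symptom report above can be checked from the
Erlang side with a round trip through the two BIFs. A minimal sketch, not
taken from the thread: the module name tz_roundtrip and its functions are
hypothetical, and only the BIFs and calendar:datetime_to_gregorian_seconds/1
are existing calls. Local times that fall inside a DST transition may also
fail the round trip for unrelated reasons, so a time well away from such
transitions should be used.

```erlang
-module(tz_roundtrip).
-export([check/1, drift_seconds/2]).

%% Convert a local datetime to universal time and back again.
%% Returns ok when the round trip is exact, otherwise the drift
%% in seconds together with the datetime that came back.
check(Local) ->
    Univ = erlang:localtime_to_universaltime(Local),
    case erlang:universaltime_to_localtime(Univ) of
        Local -> ok;
        Back  -> {drift, drift_seconds(Local, Back), Back}
    end.

%% Difference in seconds between two datetimes (A - B).
drift_seconds(A, B) ->
    calendar:datetime_to_gregorian_seconds(A) -
        calendar:datetime_to_gregorian_seconds(B).
```

For the two values quoted in the report,
tz_roundtrip:drift_seconds({{2008,9,1},{12,0,0}}, {{2008,9,1},{11,59,37}})
gives 23, matching the 33 - 10 seconds explained there.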

From kenji.rikitake@REDACTED Sun Sep 14 05:12:10 2008
From: kenji.rikitake@REDACTED (Kenji Rikitake)
Date: Sun, 14 Sep 2008 12:12:10 +0900
Subject: [erlang-bugs] leap-second-enabled FreeBSD doesn't work right with R12B4 erts/emulator/beam/erl_time_sup.c; correction patch included
In-Reply-To:
References: <20080914014111.GA7026@k2r.org>
Message-ID: <20080914031210.GA9162@k2r.org>

In the message
dated Sat, Sep 13, 2008 at 08:03:30PM -0700,
Matthew Dempsky writes:
> > Should I rather do this using timegm(3) and throw away
> > the rather naive computation algorithm in univ_to_local()?
> > (Note that this is not necessarily portable either)
>
> mktime(3) is defined by POSIX.

mktime() certainly is, but timegm() isn't, AFAIK.

An example implementation at the following URL will not be thread-safe,
due to the usage of getenv() and setenv().
http://developer.apple.com/documentation/Darwin/Reference/ManPages/man3/timegm.3.html

BTW local_to_univ() in erl_time_sup.c uses mktime(), and I see no
problem with it.

Kenji Rikitake

From matthew.dempsky@REDACTED Sun Sep 14 06:00:43 2008
From: matthew.dempsky@REDACTED (Matthew Dempsky)
Date: Sat, 13 Sep 2008 21:00:43 -0700
Subject: [erlang-bugs] leap-second-enabled FreeBSD doesn't work right with R12B4 erts/emulator/beam/erl_time_sup.c; correction patch included
In-Reply-To: <20080914031210.GA9162@k2r.org>
References: <20080914014111.GA7026@k2r.org> <20080914031210.GA9162@k2r.org>
Message-ID:

On Sat, Sep 13, 2008 at 8:12 PM, Kenji Rikitake wrote:
> mktime() certainly is, but timegm() isn't, AFAIK.

My mistake. I confused the local-vs-UTC time functions.

time2posix(3) and posix2time(3) come from Olson's tz library, which
according to Wikipedia[1], is pretty widely used, so they seem like
they should be reasonably portable. Anywhere that doesn't have them
probably isn't leap second aware anyways. They're also present in
FreeBSD, NetBSD, and OpenBSD's cvs trees from over 10 years ago, so
they're hardly recent additions either.

[1] http://en.wikipedia.org/wiki/Zoneinfo#Use_in_software_systems

From kenji.rikitake@REDACTED Sun Sep 14 07:24:18 2008
From: kenji.rikitake@REDACTED (Kenji Rikitake)
Date: Sun, 14 Sep 2008 14:24:18 +0900
Subject: [erlang-bugs] leap-second-enabled FreeBSD doesn't work right with R12B4 erts/emulator/beam/erl_time_sup.c; correction patch included
In-Reply-To:
References: <20080914014111.GA7026@k2r.org> <20080914031210.GA9162@k2r.org>
Message-ID: <20080914052418.GA10612@k2r.org>

I also think using time2posix(3) and posix2time(3) will be a better
way than timegm(), as Matthew describes.

I've also found out that the following three functions

calendar:now_to_datetime/1
calendar:now_to_universal_time/1
(equivalent to calendar:now_to_datetime/1)
calendar:now_to_local_time/1

do not work in the leap-second-enabled environment, due to the fact
that erlang:now/0 shows the internal clock value as is, with
gettimeofday(2). The converted results of the functions include the
offset of ((TAI-UTC) - 10) seconds.

Modifying erlang:now/0, defined as get_now() in erl_time_sup.c, to
include the offset of time2posix(3) is a possible solution, though I
don't feel like doing it because it will surely break the assumption
that erlang:now/0 increases continuously and monotonically.

Fixing only the calendar module functions by adding a time2posix(3)
calculation routine written in C somewhere in the BEAM BIFs looks
better to me, though I need to investigate further.

Regards,
Kenji Rikitake

In the message
dated Sat, Sep 13, 2008 at 09:00:20PM -0700,
Matthew Dempsky writes:
> On Sat, Sep 13, 2008 at 8:12 PM, Kenji Rikitake wrote:
> > mktime() certainly is, but timegm() isn't, AFAIK.
>
> My mistake. I confused the local-vs-UTC time functions.
>
> time2posix(3) and posix2time(3) come from Olson's tz library, which
> according to Wikipedia[1], is pretty widely used, so they seem like
> they should be reasonably portable. Anywhere that doesn't have them
> probably isn't leap second aware anyways.
They're also present in > FreeBSD, NetBSD, and OpenBSD's cvs trees from over 10 years ago, so > they're hardly recent additions either. > > [1] http://en.wikipedia.org/wiki/Zoneinfo#Use_in_software_systems > From raimo+erlang-bugs@REDACTED Mon Sep 15 09:56:08 2008 From: raimo+erlang-bugs@REDACTED (Raimo Niskanen) Date: Mon, 15 Sep 2008 09:56:08 +0200 Subject: [erlang-bugs] : : : Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port In-Reply-To: <20080909113207.GA2289@erix.ericsson.se> References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com> <20080908100037.GA1280@erix.ericsson.se> <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com> <20080909071243.GA30749@erix.ericsson.se> <6c2563b20809090347s6ee29ebfq7f2c39e476627807@mail.gmail.com> <20080909113207.GA2289@erix.ericsson.se> Message-ID: <20080915075608.GA14206@erix.ericsson.se> On Tue, Sep 09, 2008 at 01:32:07PM +0200, Raimo Niskanen wrote: > On Tue, Sep 09, 2008 at 06:47:25AM -0400, Edwin Fine wrote: > > Raimo, > > > > Yes, it must be a port that is open in the firewall but have no listening > > socket. > > I have not tried it on other targets (I only have Windows and Linux, and I > > tried connecting from Linux to Windows XP). > > > > Hope this helps. > > I can reproduce the bug on a SLES 10 SP 1 x86_64 > "Erlang (BEAM) emulator version 5.6.4 [source] [64-bit] [smp:4] [async-threads:0] [hipe] [kernel-poll:false]\n" > both towards an XP machine and towards another SLES 10 machine, > but oddly enough not against the machine itself neither over > the loopback interface nor the external interface. It probably > suggests badass timing is involved. I hope debug compiled > still shows the symptom. > > I'll be back... > Well, it was not even a clear-cut problem. It turned out to be a known problem. We ran into it a few months ago and the solution then was to ignore the problem i.e workaround in the testcases. It was assumed we had found a Linux kernel bug. We do as supposed. 
If connect() for a non-blocking socket fails with EINPROGRESS we put it in the poll() set and call poll(). Later poll() returns with POLLERR|POLLHUP on the socket. We call getsockopt(,SOL_SOCKET,SO_ERROR,,) to check if the connect succeeded. So far all is as in the manual, but sometimes the check succeeds although the socket is unusable: all recv(), sendmsg(), etc. fail. The symptoms were also not that bad. Any subsequent usage of the socket fails, which a real application has to be prepared for anyway. But taking a closer look with strace reveals that we sometimes call connect() in one thread, poll() in another and getsockopt() in a third, and sometimes all in the same thread. This task wanders between the schedulers in our SMP VM. And when the problem starts it seems poll() returns with POLLOUT|POLLHUP for the socket before we call connect() in another thread, which is temporally impossible. I have seen this in one strace and cannot reproduce it; while strace is running the bug does not show itself. So, we have these possibilities: 1) A bug of ours where we mess up the locking and loading of data for the poll set. 2) A Linux kernel bug in this rare case of tossing the task between threads. 3) An strace bug for SMP. Its view of the timeline is not necessarily correct. I'll dig further. I might write a small C program to try to provoke the Linux kernel bug, and if it does not provoke it, it is our bug. > > > > On Tue, Sep 9, 2008 at 3:12 AM, Raimo Niskanen < > > raimo+erlang-bugs@REDACTED >wrote: > > > > > On Mon, Sep 08, 2008 at 08:43:22AM -0400, Edwin Fine wrote: > > > > Raimo, > > > > > > > > Thanks for the response. Good luck finding the bug. I just confirmed that > > > it > > > > is still present on R12B-4. Please note that you need to connect to a > > > port > > > > that is open but with no program using it (e.g. one could try port 80 > > > > without httpd running). Sorry to state the obvious, it's a bad habit of > > > > mine.
> > > > > > Nono, please state the obvious. People often leave out the > > > obvious, that they think. And it turns out to be non-obvous. > > > > > > But on the other hand it may be confusing too. Do you mean > > > that it must be a port that is open in the firewall > > > but have no listening socket so you get the RST response > > > from the TCP stack on the target machine (that is supposed > > > to be Windows XP. Have you tried other targets? Since you > > > report having seen SYN in RST out on the target for all > > > connection attempts it should not matter). > > > > > > > > > > > Regards, > > > > Edwin Fine > > > > > > > > On Mon, Sep 8, 2008 at 6:00 AM, Raimo Niskanen < > > > > raimo+erlang-bugs@REDACTED< > > > raimo%2Berlang-bugs@REDACTED > > > >>wrote: > > > > > > > > > On Sun, Sep 07, 2008 at 03:23:21PM -0400, Edwin Fine wrote: > > > > > > Hi OTP Team, > > > > > > > > > > > > I realize you have been very busy with the R12B-4 release, and this > > > is > > > > > not a > > > > > > complaint or criticism, just a request for info. > > > > > > > > > > Perhaps it should be... > > > > > > > > > > > > > > > > > I reported this bug some weeks ago and have not received an > > > > > acknowledgment. > > > > > > I simply want to know if you accepted it, rejected it, or fixed it > > > > > already > > > > > > > > > > You are right, we have been busy with the release. > > > > > > > > > > Your problem (as we say in swedish) fell between the chairs. > > > > > If it is an inet_drv bug it is one guys problem, an SMP bug > > > > > another guys problem. But enough excuses... > > > > > we will look into it now. It sounds serious. > > > > > > > > > > > (and if so, in which release the fix appears). I have had to code > > > around > > > > > > this and would like to know if I can remove that code. 
> > > > > > > > > > > > Link to original bug report: > > > > > > http://www.erlang.org/pipermail/erlang-bugs/2008-August/000931.html > > > > > > > > > > > > Best regards, > > > > > > Edwin Fine > > > > > > > > > > > _______________________________________________ > > > > > > erlang-bugs mailing list > > > > > > erlang-bugs@REDACTED > > > > > > http://www.erlang.org/mailman/listinfo/erlang-bugs > > > > > > > > > > -- > > > > > > > > > > / Raimo Niskanen, Erlang/OTP, Ericsson AB > > > > > > > > > > > > > > > > > _______________________________________________ > > > > erlang-bugs mailing list > > > > erlang-bugs@REDACTED > > > > http://www.erlang.org/mailman/listinfo/erlang-bugs > > > > > > -- > > > > > > / Raimo Niskanen, Erlang/OTP, Ericsson AB > > > > > > > > > _______________________________________________ > > erlang-bugs mailing list > > erlang-bugs@REDACTED > > http://www.erlang.org/mailman/listinfo/erlang-bugs > > -- > > / Raimo Niskanen, Erlang/OTP, Ericsson AB > _______________________________________________ > erlang-bugs mailing list > erlang-bugs@REDACTED > http://www.erlang.org/mailman/listinfo/erlang-bugs -- / Raimo Niskanen, Erlang/OTP, Ericsson AB From raimo+erlang-bugs@REDACTED Mon Sep 15 16:54:10 2008 From: raimo+erlang-bugs@REDACTED (Raimo Niskanen) Date: Mon, 15 Sep 2008 16:54:10 +0200 Subject: [erlang-bugs] : : : : Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port In-Reply-To: <20080915075608.GA14206@erix.ericsson.se> References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com> <20080908100037.GA1280@erix.ericsson.se> <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com> <20080909071243.GA30749@erix.ericsson.se> <6c2563b20809090347s6ee29ebfq7f2c39e476627807@mail.gmail.com> <20080909113207.GA2289@erix.ericsson.se> <20080915075608.GA14206@erix.ericsson.se> Message-ID: <20080915145410.GA20045@erix.ericsson.se> On Mon, Sep 15, 2008 at 09:56:08AM +0200, Raimo Niskanen wrote: > On Tue, Sep 09, 2008 at 01:32:07PM 
+0200, Raimo Niskanen wrote: > > On Tue, Sep 09, 2008 at 06:47:25AM -0400, Edwin Fine wrote: > > > Raimo, > > > > > > Yes, it must be a port that is open in the firewall but have no listening > > > socket. > > > I have not tried it on other targets (I only have Windows and Linux, and I > > > tried connecting from Linux to Windows XP). > > > > > > Hope this helps. > > > > I can reproduce the bug on a SLES 10 SP 1 x86_64 > > "Erlang (BEAM) emulator version 5.6.4 [source] [64-bit] [smp:4] [async-threads:0] [hipe] [kernel-poll:false]\n" > > both towards an XP machine and towards another SLES 10 machine, > > but oddly enough not against the machine itself neither over > > the loopback interface nor the external interface. It probably > > suggests badass timing is involved. I hope debug compiled > > still shows the symptom. > > > > I'll be back... > > > > Well, it was not even a clear-cut problem. > > It turned out to be a known problem. We ran into it a few months ago > and the solution then was to ignore the problem i.e workaround in > the testcases. It was assumed we had found a Linux kernel bug. > > We do as supposed. If connect() for a non-blocking socket fails > with EINPROGRESS we put it in the poll() set and call poll(). > Later poll() returns with POLLERR|POLLHUP on the socket. > We call getsockopt(,SOL_SOCKET,SO_ERROR,,) to check if > the connect succeeded, so far all is as in the manual, but sometimes > it succeeds but the socket is unusable. All recv() and sendmsg(), > etc fails. > > The symptoms was also not that bad. Any subsequent usage of the > sockets fails, which a real application will have to be > prepared to anyway. > > But taking a closer lock with strace reveals that we call > connect() in one thread, poll() in another and getsockopt() > in a third. Sometimes, and sometimes all in the same thread. > This task wanders between the schedulers in our SMP VM. 
>
> And when the problem starts it seems poll() returns with
> POLLOUT|POLLHUP for the socket before we call connect()
> in another thread, which is temporally impossible.
> I have seen this in one strace and can not reproduce it.
> while strace is running the bug does not show itself.
>
> So, whe have the possibilites:
> 1) A bug of ours where we mess up with the locking
>    and loading of data for the poll set.
> 2) A Linux kernel bug in this rare case of tossing
>    the task between threads.
> 3) An strace bug for SMP. Its view of the timeline
>    is not necessarily correct.
>
> I'll dig further.
>
> I might write a small C program to try to provoke the Linux
> kernel bug, and if it does not provoke it, it is our bug.
>

Found the bug!

It was a clear and simple bug in the TCP|UDP|SCTP/IP driver
inet_drv: when making a connect, the poll set was changed
first and then connect was called, and if connect did
not give EINPROGRESS, the poll set was reset again.

This is the wrong way to do it. The right way is to change
the poll set, if you have to, after getting EINPROGRESS.

What happened now was:
[thread 1]                          [thread 2]

poll( ...
                                    socket() -> Socket
                                    Change pollset data
                                    write(InternalPipe)
                                      to inform poll thread
...)poll -> internal pipe POLLIN|POLLRDNORM
read(InternalPipe)
                                    connect(Socket, ...
poll() -> Socket POLLOUT|POLLHUP
                                    ... )connect -> EINPROGRESS
getsockopt(Socket, SOL_SOCKET, SO_ERROR) ->
    no error

Note that poll returns with ready for writing on Socket
before connect returns with EINPROGRESS in the other
thread, and Linus only knows why getsockopt() in this
particular case returns no error.

But we are mishandling the socket, that is for sure.
Try this patch:
*** /clearcase/otp/erts/erts/emulator/drivers/common/inet_drv.c@@/OTP_R12B-4 2008-09-01 14:51:18.000000000 +0200
--- /clearcase/otp/erts/erts/emulator/drivers/common/inet_drv.c 2008-09-15 16:23:51.000000000 +0200
***************
*** 7239,7257 ****
                                   buf, &len) == NULL)
              return ctl_error(EINVAL, rbuf, rsize);
  
-         sock_select(INETP(desc), FD_CONNECT, 1);
          code = sock_connect(desc->inet.s,
                              (struct sockaddr*) &desc->inet.remote, len);
          if ((code == SOCKET_ERROR) &&
                  ((sock_errno() == ERRNO_BLOCK) || /* Winsock2 */
                   (sock_errno() == EINPROGRESS))) { /* Unix & OSE!! */
              desc->inet.state = TCP_STATE_CONNECTING;
              if (timeout != INET_INFINITY)
                  driver_set_timer(desc->inet.port, timeout);
              enq_async(INETP(desc), tbuf, INET_REQ_CONNECT);
          }
          else if (code == 0) { /* ok we are connected */
-             sock_select(INETP(desc), FD_CONNECT, 0);
              desc->inet.state = TCP_STATE_CONNECTED;
              if (desc->inet.active)
                  sock_select(INETP(desc), (FD_READ|FD_CLOSE), 1);
--- 7239,7256 ----
                                   buf, &len) == NULL)
              return ctl_error(EINVAL, rbuf, rsize);
  
          code = sock_connect(desc->inet.s,
                              (struct sockaddr*) &desc->inet.remote, len);
          if ((code == SOCKET_ERROR) &&
                  ((sock_errno() == ERRNO_BLOCK) || /* Winsock2 */
                   (sock_errno() == EINPROGRESS))) { /* Unix & OSE!! */
+             sock_select(INETP(desc), FD_CONNECT, 1);
              desc->inet.state = TCP_STATE_CONNECTING;
              if (timeout != INET_INFINITY)
                  driver_set_timer(desc->inet.port, timeout);
              enq_async(INETP(desc), tbuf, INET_REQ_CONNECT);
          }
          else if (code == 0) { /* ok we are connected */
              desc->inet.state = TCP_STATE_CONNECTED;
              if (desc->inet.active)
                  sock_select(INETP(desc), (FD_READ|FD_CLOSE), 1);
***************
*** 7259,7265 ****
              async_ok(INETP(desc));
          }
          else {
-             sock_select(INETP(desc), FD_CONNECT, 0);
              return ctl_error(sock_errno(), rbuf, rsize);
          }
          return ctl_reply(INET_REP_OK, tbuf, 2, rbuf, rsize);
--- 7258,7263 ----

The patch has not been run through all our regression tests yet.
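
The ordering the patch moves to (call connect() first, and hand the
socket to the poller only after EINPROGRESS) can be sketched outside the
driver roughly as follows. This is a minimal standalone illustration, not
the inet_drv code: connect_nonblocking() and probe_loopback() are
hypothetical helpers invented for this example, plain POSIX poll() stands
in for the driver's sock_select(), and a single thread does all the work.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper, not from inet_drv.c: returns 0 once connected,
 * otherwise an errno value such as ECONNREFUSED or ETIMEDOUT. */
static int connect_nonblocking(int fd, const struct sockaddr *addr,
                               socklen_t len, int timeout_ms)
{
    struct pollfd pfd;
    int flags, n, err = 0;
    socklen_t errlen = sizeof err;

    assert(addr != NULL);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return errno;

    /* 1. Call connect() first; nothing is polling this descriptor yet. */
    if (connect(fd, addr, len) == 0)
        return 0;                       /* connected at once */
    if (errno != EINPROGRESS)
        return errno;                   /* failed at once, never polled */

    /* 2. Only after EINPROGRESS: wait for the socket to become writable. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);
    if (n < 0)
        return errno;
    if (n == 0)
        return ETIMEDOUT;

    /* 3. Writability alone proves nothing; SO_ERROR holds the outcome. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return errno;
    return err;
}

/* Demo helper (hypothetical): bind a loopback listener on a kernel-chosen
 * port, optionally close it again, then connect to that port.
 * Returns connect_nonblocking()'s result, or -1 if the setup failed. */
static int probe_loopback(int keep_listener)
{
    struct sockaddr_in sa;
    socklen_t salen = sizeof sa;
    int lfd, cfd, rc;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0;                    /* let the kernel pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    if (bind(lfd, (struct sockaddr *)&sa, sizeof sa) != 0 ||
        listen(lfd, 1) != 0 ||
        getsockname(lfd, (struct sockaddr *)&sa, &salen) != 0) {
        close(lfd);
        return -1;
    }
    if (!keep_listener) {
        close(lfd);                     /* port now has no listening socket */
        lfd = -1;
    }

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) {
        if (lfd >= 0)
            close(lfd);
        return -1;
    }
    rc = connect_nonblocking(cfd, (struct sockaddr *)&sa, sizeof sa, 2000);
    close(cfd);
    if (lfd >= 0)
        close(lfd);
    return rc;
}
```

On a typical Linux loopback one would expect probe_loopback(1) to return
0 and probe_loopback(0) to return ECONNREFUSED. The point of the sketch
is only the order of steps 1 to 3; it does not try to reproduce the
cross-thread race described above.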
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From erlang-questions_efine@REDACTED Mon Sep 15 17:17:42 2008
From: erlang-questions_efine@REDACTED (Edwin Fine)
Date: Mon, 15 Sep 2008 11:17:42 -0400
Subject: [erlang-bugs] : : : : Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
In-Reply-To: <20080915145410.GA20045@erix.ericsson.se>
References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com>
 <20080908100037.GA1280@erix.ericsson.se>
 <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com>
 <20080909071243.GA30749@erix.ericsson.se>
 <6c2563b20809090347s6ee29ebfq7f2c39e476627807@mail.gmail.com>
 <20080909113207.GA2289@erix.ericsson.se>
 <20080915075608.GA14206@erix.ericsson.se>
 <20080915145410.GA20045@erix.ericsson.se>
Message-ID: <6c2563b20809150817t31ec0f00u80522430e234df54@mail.gmail.com>

Raimo,

Great work! You found it pretty quickly. I'll apply the patch as soon as
possible.
I'll wait for a while before applying the patch to production systems until
the regression tests are done.

Thanks very much.
Regards,
Edwin Fine

From raimo+erlang-bugs@REDACTED Tue Sep 16 09:22:05 2008
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Tue, 16 Sep 2008 09:22:05 +0200
Subject: [erlang-bugs] : : : : : Follow up: [BUG] gen_tcp:connect/3, 4 returns socket for closed port
In-Reply-To: <6c2563b20809150817t31ec0f00u80522430e234df54@mail.gmail.com>
References: <6c2563b20809071223t2cf8b4cat946107d1ea2f8920@mail.gmail.com>
 <20080908100037.GA1280@erix.ericsson.se>
 <6c2563b20809080543q6a157b6at2543bba96ffe9cd5@mail.gmail.com>
 <20080909071243.GA30749@erix.ericsson.se>
 <6c2563b20809090347s6ee29ebfq7f2c39e476627807@mail.gmail.com>
 <20080909113207.GA2289@erix.ericsson.se>
 <20080915075608.GA14206@erix.ericsson.se>
 <20080915145410.GA20045@erix.ericsson.se>
 <6c2563b20809150817t31ec0f00u80522430e234df54@mail.gmail.com>
Message-ID: <20080916072205.GA3530@erix.ericsson.se>

On Mon, Sep 15, 2008 at 11:17:42AM -0400, Edwin Fine wrote:
> Raimo,
>
> Great work! You found it pretty quickly. I'll apply the patch as soon as
> possible.
> I'll wait for a while before applying the patch to production systems until
> the regression tests are done.

We found no problems with this patch in the regression tests.
Now we will see what you and the other users find...

>
> Thanks very much.
> Regards,
> Edwin Fine

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From mog-lists@REDACTED Tue Sep 16 17:25:37 2008
From: mog-lists@REDACTED (mog)
Date: Tue, 16 Sep 2008 10:25:37 -0500
Subject: [erlang-bugs] rsa upgrades for crypto
In-Reply-To: <48CF939D.3010809@erix.ericsson.se>
References: <20080911105451.GA13742@metalman.digium.internal>
 <48C8FFE2.5040902@erix.ericsson.se>
 <20080911135903.GA4420@metalman.digium.internal>
 <48CF939D.3010809@erix.ericsson.se>
Message-ID: <20080916152537.GA6956@metalman.digium.internal>

On Tue, Sep 16, 2008 at 01:08:13PM +0200, Dan Gudmundsson wrote:
> Hi
>
> This is what I currently got is it good enough?
>

Works great on my box and for what I am using it for, I can sign and verify
without any problems. Thank you so much for taking time to clean up and
apply patch.

Mog

From matthew@REDACTED Wed Sep 17 01:43:54 2008
From: matthew@REDACTED (Matthew Dempsky)
Date: Tue, 16 Sep 2008 16:43:54 -0700
Subject: [erlang-bugs] net_kernel waiting for response from self
Message-ID:

Earlier we noticed one of our Erlang nodes was not connected to a
bunch of other nodes, and that issuing net_adm:ping(Node) to try to
reconnect it would just hang. It seems that net_kernel had issued a
gen_server:call to itself and was waiting indefinitely for a response.

Complete backtrace of Pid (net_kernel, <0.22.0>) is below. From the
back trace you can see net_kernel made a call to
gen_server:call(net_kernel, {connect, normal, 'mochisvn@REDACTED'},
infinity) and sent the request with #Ref<0.0.622.220265>. To confirm
this, you can look at the process_info dump and see the second element
in the message queue is {'$gen_call',{<0.22.0>,#Ref<0.0.622.220265>},
{connect,normal,'mochisvn@REDACTED'}}.

I forged a response by grabbing that reference and calling Pid !
{Ref, false}, which finally led net_kernel to proceed a little bit
further, but it immediately hung trying to handle
{'$gen_call',{<6341.14452.446>,#Ref<6341.0.542.13516>},
{is_auth,'discover@REDACTED'}}.

I repeated the process a few more times, forging responses to the
connect calls, but eventually gave up and just restarted the node
after determining there were 300,000 pending is_auth requests in the
message queue, and each needed a forged response to continue.

(mochiadsdb@REDACTED)88> bt(Pid).
Program counter: 0x00002aaaab841870 (gen:wait_resp_mon/3 + 64)
CP: 0x00002aaaab89dc30 (gen_server:call/3 + 160)
arity = 0

0x00002aaae3591870 Return addr 0x00002aaaab89dc30 (gen_server:call/3 + 160)
y(0)     infinity
y(1)     #Ref<0.0.622.220265>
y(2)     'mochiadsdb@REDACTED'

0x00002aaae3591890 Return addr 0x00002aaaab812d88 (erlang:dsend/2 + 136)
y(0)     infinity
y(1)     {connect,normal,'mochisvn@REDACTED'}
y(2)     net_kernel
y(3)     Catch 0x00002aaaab89dc30 (gen_server:call/3 + 160)

0x00002aaae35918b8 Return addr 0x00002aaaab89e160 (gen_server:reply/2 + 208)
y(0)     {#Ref<6397.0.34.240428>,yes}
y(1)     <6397.6917.55>

0x00002aaae35918d0 Return addr 0x00002aaaab8a2370 (gen_server:handle_msg/5 + 848)
y(0)     Catch 0x00002aaaab89e160 (gen_server:reply/2 + 208)

0x00002aaae35918e0 Return addr 0x00002aaaab853310 (proc_lib:init_p/5 + 400)
y(0)     net_kernel
y(1)     []
y(2)     net_kernel
y(3)     <0.19.0>
y(4)     []
y(5)     []
y(6)
{state,'mochiadsdb@REDACTED','mochiadsdb@REDACTED',longnames,{tick,<0.24.0>,15000},7000,sys_dist,[{<0.27218.334>,'mochibot@REDACTED'},{<0.5652.334>,'mochimonitor_ping@REDACTED'},{<0.30534.333>,'mochimond@REDACTED'},{<0.14584.332>,'mochibot@REDACTED'},{<0.10693.325>,'discover@REDACTED'},{<0.10679.325>,'mochimond@REDACTED'},{<0.8624.325>,'mochimond@REDACTED'},{<0.21639.321>,'discover@REDACTED'},{<0.21631.321>,'mochimond@REDACTED'},{<0.1479.310>,'mochimond@REDACTED'},{<0.1474.310>,'discover@REDACTED'},{<0.1794.308>,'mochimond@REDACTED'},{<0.14155.304>,'mochisvn@REDACTED'},{<0.8435.304>,'mochimond@REDACTED'},{<0.11959.295>,'mochimond@REDACTED'},{<0.11954.295>,'discover@REDACTED'},{<0.11952.295>,'mochisvn@REDACTED'},{<0.12655.294>,'mochimonitor_relay@REDACTED'},{<0.1597.285>,'mochicrypt@REDACTED'},{<0.31519.268>,'mochimond@REDACTED'},{<0.31492.268>,'mochiads_logger@REDACTED'},{<0.31418.268>,'mochimq2@REDACTED'},{<0.28344.266>,'mochimond@REDACTED'},{<0.28341.266>,'discover@REDACTED'},{<0.9573.266>,'mochisvn@REDACTED'},{<0.28064.264>,'mochimonitor_httpcheck@REDACTED'},{<0.24678.264>,'mochibot@REDACTED'},{<0.25804.263>,'mochimq2@REDACTED'},{<0.4231.260>,'mochiscore@REDACTED'},{<0.1352.260>,'admapper@REDACTED'},{<0.1348.260>,'mochiscore@REDACTED'},{<0.1346.260>,'mochiads_loggerv2@REDACTED'},{<0.32206.259>,'discover@REDACTED'},{<0.32169.259>,'discover@REDACTED'},{<0.20340.259>,'mochimond@REDACTED'},{<0.19993.259>,'mochimond@REDACTED'},{<0.16961.250>,'discover@REDACTED'},{<0.6399.250>,'discover@REDACTED'},{<0.6370.250>,'mochipass@REDACTED'},{<0.6286.250>,'discover@REDACTED'},{<0.6278.250>,'mochipass@REDACTED'},{<0.22656.248>,'mochimond@REDACTED'},{<0.22654.248>,'mochimond@REDACTED'},{<0.22653.248>,'mochimond@REDACTED'},{<0.22625.248>,'mochisvn@REDACTED'},{<0.22623.248>,'mochisvn@REDACTED'},{<0.22621.248>,'mochisvn@REDACTED'},{<0.22619.248>,'mochisvn@REDACTED'},{<0.22603.248>,'mochisvn@REDACTED'},{<0.22593.248>,'mochisvn@REDACTED'},{<0.22588.248>,'mochisvn@REDACTED'},{<0.22559.
248>,'mochisvn@REDACTED'},{<0.22554.248>,'mochisvn@REDACTED'},{<0.22546.248>,'mochisvn@REDACTED'},{<0.22542.248>,'mochisvn@REDACTED'},{<0.22538.248>,'mochisvn@REDACTED'},{<0.22524.248>,'mochisvn@REDACTED'},{<0.22519.248>,'mochisvn@REDACTED'},{<0.22515.248>,'mochisvn@REDACTED'},{<0.22511.248>,'mochisvn@REDACTED'},{<0.22507.248>,'mochisvn@REDACTED'},{<0.22487.248>,'mochisvn@REDACTED'},{<0.22474.248>,'mochisvn@REDACTED'},{<0.22469.248>,'mochisvn@REDACTED'},{<0.19636.245>,'discover@REDACTED'},{<0.18890.245>,'mochimond@REDACTED'},{<0.18885.245>,'mochimond@REDACTED'},{<0.18856.245>,'discover@REDACTED'},{<0.18850.245>,'mochiads@REDACTED'},{<0.18830.245>,'discover@REDACTED'},{<0.18824.245>,'mochiads@REDACTED'},{<0.18818.245>,'discover@REDACTED'},{<0.18814.245>,'mochiads@REDACTED'},{<0.18800.245>,'discover@REDACTED'},{<0.18791.245>,'mochimond@REDACTED'},{<0.18789.245>,'mochiads@REDACTED'},{<0.18786.245>,'mochiads@REDACTED'},{<0.18721.245>,'mochimond@REDACTED'},{<0.18711.245>,'mochiads_logger@REDACTED'},{<0.18575.245>,'mochiads_logger@REDACTED'},{<0.18556.245>,'mochimonitor_alert@REDACTED'},{<0.18535.245>,'mochimond@REDACTED'},{<0.18423.245>,'mochiads_logger@REDACTED'},{<0.12300.159>,'mochimond@REDACTED'},{<0.23778.120>,'mochimond@REDACTED'},{<0.3295.120>,'juanita@REDACTED'},{<0.23280.101>,'mochiscore@REDACTED'},{<0.23276.101>,'mochiscore@REDACTED'},{<0.3402.67>,'discover@REDACTED'},{<0.3398.67>,'mochimond@REDACTED'},{<0.3339.67>,'discover@REDACTED'},{<0.3335.67>,'mochimond@REDACTED'},{<0.3328.67>,'discover@REDACTED'},{<0.8774.2>,'mochiads@REDACTED'},{<0.673.0>,'mochimonitor_alert@REDACTED'},{<0.679.0>,'mochimonitor_dbweb@REDACTED'},{<0.612.0>,'mochipass@REDACTED'},{<0.350.0>,'discover@REDACTED'},{<0.342.0>,'discover@REDACTED'},{<0.344.0>,'mochimond@REDACTED'},{<0.333.0>,'mochimond@REDACTED'},{<0.288.0>,'discover@REDACTED'},{<0.227.0>,'discover@REDACTED'},{<0.77.0>,'discover@REDACTED'},{<0.183.0>,'discover@REDACTED'},{<0.117.0>,'mochimond@REDACTED'},{<0.103.0>,'discover@REDAC
TED'},{<0.98.0>,'discover@REDACTED'},{<0.91.0>,'mochimond@REDACTED'},{<0.84.0>,'mochimond@REDACTED'},{<0.75.0>,'mochimond@REDACTED'},{<0.68.0>,'discover@REDACTED'}],[],[{listen,#Port<0.7>,<0.23.0>,{net_address,{{0,0,0,0},39758},"reaver",tcp,inet},inet_tcp_dist}],[],0,all} 0x00002aaae3591920 Return addr 0x00000000008572f8 () y(0) Catch 0x00002aaaab853330 (proc_lib:init_p/5 + 432) y(1) gen y(2) init_it y(3) [gen_server,<0.19.0>,<0.19.0>,{local,net_kernel},net_kernel,{'mochiadsdb@REDACTED',longnames,15000},[]] ok (mochiadsdb@REDACTED)89> erlang:process_info(Pid). [{registered_name,net_kernel}, {current_function,{gen,wait_resp_mon,3}}, {initial_call,{proc_lib,init_p,5}}, {status,waiting}, {message_queue_len,458946}, {messages,[{'EXIT',<0.22619.248>,connection_closed}, {'$gen_call',{<0.22.0>,#Ref<0.0.622.220265>}, {connect,normal,'mochisvn@REDACTED'}}, {'$gen_call',{<6341.14452.446>,#Ref<6341.0.542.13516>}, {is_auth,'discover@REDACTED'}}, tick,tick,tick,tick, {'EXIT',<0.1348.260>,connection_closed}, {'EXIT',<0.679.0>,connection_closed}, {'EXIT',<0.6370.250>,connection_closed}, {'EXIT',<0.18818.245>,connection_closed}, {'EXIT',<0.22656.248>,connection_closed}, {'EXIT',<0.21639.321>,connection_closed}, {'EXIT',<0.612.0>,connection_closed}, {'EXIT',<0.12655.294>,connection_closed}, {'EXIT',<0.23280.101>,connection_closed}, {'EXIT',<0.183.0>,connection_closed}, {'EXIT',<0.673.0>,connection_closed}, {'EXIT',<0.18711.245>,...}, {'EXIT',...}, {...}|...]}, {links,[<0.12300.159>,<0.22603.248>,<0.11959.295>, <0.1474.310>,<0.1479.310>,<0.30534.333>,<0.14155.304>, <0.1794.308>,<0.8435.304>,<0.32206.259>,<0.11952.295>, <0.11954.295>,<0.9573.266>,<0.1597.285>,<0.20340.259>, <0.32169.259>,<0.19993.259>,<0.22511.248>,<0.22554.248>, <0.22588.248>|...]}, {dictionary,[{'$ancestors',[net_sup,kernel_sup,<0.9.0>]}, {longnames,true}, {'$initial_call',{gen,init_it, [gen_server,<0.19.0>,<0.19.0>, {local,net_kernel}, net_kernel, {'mochiadsdb@REDACTED',longnames,15000}, []]}}]}, {trap_exit,true}, 
 {error_handler,error_handler},
 {priority,max},
 {group_leader,<0.8.0>},
 {total_heap_size,8024355},
 {heap_size,8024355},
 {stack_size,27},
 {reductions,933092852},
 {garbage_collection,[{fullsweep_after,65535},{minor_gcs,0}]},
 {suspending,[]}]

From ulf.wiger@REDACTED Wed Sep 17 09:44:01 2008
From: ulf.wiger@REDACTED (Ulf Wiger (TN/EAB))
Date: Wed, 17 Sep 2008 09:44:01 +0200
Subject: [erlang-bugs] common test missing install.sh
Message-ID: <48D0B541.7090905@ericsson.com>

In the Common Test User Guide
(http://www.erlang.org/doc/apps/common_test/install_chapter.html#2)
the installation instruction says to run the install.sh script.

But there is no such script in common_test-1.3.2, or in any of its
subdirectories.

The best solution would of course be if common_test didn't need to be
copied out of the OTP tree, and was able to use test suites located in
some directory given at execution time. If this is the case, the
documentation should be updated.

BR,
Ulf W

From kenji.rikitake@REDACTED Thu Sep 18 04:16:02 2008
From: kenji.rikitake@REDACTED (Kenji Rikitake)
Date: Thu, 18 Sep 2008 11:16:02 +0900
Subject: [erlang-bugs] leap-second-enabled FreeBSD doesn't work right with R12B4 erts/emulator/beam/erl_time_sup.c; correction patch included
In-Reply-To: <20080914052418.GA10612@k2r.org>
References: <20080914014111.GA7026@k2r.org> <20080914031210.GA9162@k2r.org> <20080914052418.GA10612@k2r.org>
Message-ID: <20080918021602.GA6530@k2r.org>

The attached patch includes an additional BIF erlang:now_utc/0 to solve
the problem, as well as the previous fix for
erlang:universaltime_to_localtime/1.

Adding a BIF is surely experimental, but I think this is the easiest way
to solve this issue. Changing erlang:now/0 semantics is quite risky.

I'm still wondering if this is the best way to deal with the systems
with leap-second-enabled wall clocks (and OSes). I'd appreciate if I
can read any comment from Erlang/OTP Team on this issue.

Regards,
Kenji Rikitake

In the message <20080914052418.GA10612@REDACTED>
dated Sun, Sep 14, 2008 at 02:23:55PM +0900,
Kenji Rikitake writes:
> I've also found out that the following three functions
>
> calendar:now_to_datetime/1
> calendar:now_to_universal_time/1 (equivalent to calendar:now_to_datetime/1)
> calendar:now_to_local_time/1
>
> do not work in the leap-second-enabled environment, due to the fact that
> erlang:now/0 shows the internal clock value as is, with gettimeofday(2).
> The converted results of the functions include the offset of
> ((TAI-UTC) - 10)
> seconds.
>
> Modifying erlang:now/0, defined as get_now() in erl_time_sup.c, to
> include the offset of time2posix(3) is a possible solution, though I
> don't feel like doing it because it will surely break the assumption
> that erlang:now/0 increases continuously and monotonically.
>
> Fixing only the calendar module functions by adding a time2posix(3)
> calculation routine written in C somewhere in the BEAM BIFs looks better
> to me, though I need to investigate further.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: erl_now_utc.patch
Type: text/x-diff
Size: 6373 bytes
Desc: not available

From michael@REDACTED Tue Sep 23 20:01:06 2008
From: michael@REDACTED (Michael Mullis)
Date: Tue, 23 Sep 2008 14:01:06 -0400
Subject: [erlang-bugs] re module doc change
Message-ID:

Hi.

I hope this is the correct venue for documentation bugs, so please
forgive the intrusion if not.

In R12B-4, the re module doc needs to state clearly up front that
regular expressions normally containing a backslash (\), such as in the
PCRE examples, need to escape the backslash (\\) when used in Erlang
programs.
For example, re:run("foo", "\\bfoo\\b") instead of re:run("foo", "\bfoo\b")

There's a small blurb just before the "PERL LIKE REGULAR EXPRESSIONS
SYNTAX" section, but it's part of the "replace" function doc and not
clear enough to indicate that ALL backslashes intended to be part of a
regexp need to be escaped in order to pass through to the regular
expression engine.

thanks,
michael mullis.

From bengt.kleberg@REDACTED Wed Sep 24 07:51:58 2008
From: bengt.kleberg@REDACTED (Bengt Kleberg)
Date: Wed, 24 Sep 2008 07:51:58 +0200
Subject: [erlang-bugs] Very minor documentation bug in proplists module
Message-ID: <1222235518.7604.36.camel@seasc0642.dyn.rnd.as.sw.ericsson.se>

Greetings,

On http://erlang.org/doc/man/proplists.html the "See also" for
get_value/3 refers to a function get_value/1. It does not exist.

Also, I think it would be slightly better to refer to get_value/2
instead of lookup/2 in the explanation to get_value/3.

bengt

From richardc@REDACTED Wed Sep 24 10:49:01 2008
From: richardc@REDACTED (Richard Carlsson)
Date: Wed, 24 Sep 2008 10:49:01 +0200
Subject: [erlang-bugs] Very minor documentation bug in proplists module
In-Reply-To: <1222235518.7604.36.camel@seasc0642.dyn.rnd.as.sw.ericsson.se>
References: <1222235518.7604.36.camel@seasc0642.dyn.rnd.as.sw.ericsson.se>
Message-ID: <48D9FEFD.8070207@it.uu.se>

Bengt Kleberg wrote:
> Greetings,
>
> On http://erlang.org/doc/man/proplists.html the "See also" for
> get_value/3 refers to a function get_value/1. It does not exist.

Thanks. It should be get_value/2.

> Also, I think it would be slightly better to refer to get_value/2
> instead of lookup/2 in the explanation to get_value/3.

Well, no, because lookup/2 is the main primitive, while get_value/2
is just a simplified form of get_value/3 itself.
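
To make the relationship between the three functions concrete, here is a minimal sketch (the module name proplists_demo and the option list are made up for illustration; only the proplists calls themselves come from the stdlib). It shows lookup/2 returning the whole entry or none, and get_value/2 behaving like get_value/3 with the default undefined:

```erlang
-module(proplists_demo).
-export([demo/0]).

%% Illustrative only: Opts is an invented property list.
demo() ->
    Opts = [debug, {level, 3}],

    %% lookup/2 returns the entry itself; a bare atom counts as {Atom, true}.
    {level, 3}    = proplists:lookup(level, Opts),
    {debug, true} = proplists:lookup(debug, Opts),
    none          = proplists:lookup(missing, Opts),

    %% get_value/2 returns only the value, or undefined when absent.
    3         = proplists:get_value(level, Opts),
    true      = proplists:get_value(debug, Opts),
    undefined = proplists:get_value(missing, Opts),

    %% get_value/3 lets the caller choose the default.
    0 = proplists:get_value(missing, Opts, 0),
    ok.
```

Calling proplists_demo:demo() should simply return ok if every match above holds.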

/Richard

From bengt.kleberg@REDACTED Wed Sep 24 10:57:23 2008
From: bengt.kleberg@REDACTED (Bengt Kleberg)
Date: Wed, 24 Sep 2008 10:57:23 +0200
Subject: [erlang-bugs] Very minor documentation bug in proplists module
In-Reply-To: <48D9FEFD.8070207@it.uu.se>
References: <1222235518.7604.36.camel@seasc0642.dyn.rnd.as.sw.ericsson.se> <48D9FEFD.8070207@it.uu.se>
Message-ID: <1222246644.7604.48.camel@seasc0642.dyn.rnd.as.sw.ericsson.se>

OK, fine by me.
The proplists module is (IMHO) somewhat byzantine and I did not
understand that lookup/2 is the main primitive.

bengt

On Wed, 2008-09-24 at 10:49 +0200, Richard Carlsson wrote:
> Bengt Kleberg wrote:
> > Greetings,
> >
> > On http://erlang.org/doc/man/proplists.html the "See also" for
> > get_value/3 refers to a function get_value/1. It does not exist.
>
> Thanks. It should be get_value/2.
>
> > Also, I think it would be slightly better to refer to get_value/2
> > instead of lookup/2 in the explanation to get_value/3.
>
> Well, no, because lookup/2 is the main primitive, while get_value/2
> is just a simplified form of get_value/3 itself.
>
> /Richard

From richardc@REDACTED Wed Sep 24 11:33:09 2008
From: richardc@REDACTED (Richard Carlsson)
Date: Wed, 24 Sep 2008 11:33:09 +0200
Subject: [erlang-bugs] Very minor documentation bug in proplists module
In-Reply-To: <1222246644.7604.48.camel@seasc0642.dyn.rnd.as.sw.ericsson.se>
References: <1222235518.7604.36.camel@seasc0642.dyn.rnd.as.sw.ericsson.se> <48D9FEFD.8070207@it.uu.se> <1222246644.7604.48.camel@seasc0642.dyn.rnd.as.sw.ericsson.se>
Message-ID: <48DA0955.6090307@it.uu.se>

Bengt Kleberg wrote:
> The proplists module is (IMHO) somewhat byzantine and I did not
> understand that lookup/2 is the main primitive.

Oh, it's not all that bad (now that I look at it again), but it could
do with a lot more examples. In particular, it's not obvious what all
the expansion/substitution stuff is good for.
But it is; here is some example code from the hipe module
(simplified a bit):

----
expand_options(Opts) ->
    proplists:normalize(Opts, [{negations, opt_negations()},
                               {aliases, opt_aliases()},
                               {expand, opt_basic_expansions()},
                               {expand, opt_expansions()}]).

opt_negations() ->
    [{no_binary_opt, binary_opt},
     {no_bitlevel_binaries, bitlevel_binaries},
     {no_debug, debug},
     ...
     {no_use_indexing, use_indexing}].

opt_aliases() ->
    [{'O0', o0},
     {'O1', o1},
     {'O2', o2},
     {'O3', o3}].

opt_basic_expansions() ->
    [{pp_all, [pp_beam, pp_icode, pp_rtl, pp_native]}].

opt_expansions() ->
    [{o1, o1_opts()},
     {o2, o2_opts()},
     {o3, o3_opts()},
     {x87, [x87, inline_fp]}].
----

/Richard

From bgustavsson@REDACTED Wed Sep 24 14:44:36 2008
From: bgustavsson@REDACTED (Bjorn Gustavsson)
Date: Wed, 24 Sep 2008 14:44:36 +0200
Subject: [erlang-bugs] Missing documentation
In-Reply-To: <95be1d3b0809050437r25307776h93928c8d387e1bde@mail.gmail.com>
References: <95be1d3b0809050437r25307776h93928c8d387e1bde@mail.gmail.com>
Message-ID: <6672d0160809240544x453dff3ar50a87f45fd636356@mail.gmail.com>

On Fri, Sep 5, 2008 at 1:37 PM, Vlad Dumitrescu wrote:
> Hi!
>
> The R12 documentation is missing documentation for some functions. The
> ones I noticed are code:is_sticky/1 and is_module_native/1.
>

They will be documented in R12B-5.

/Bjorn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB

From bgustavsson@REDACTED Wed Sep 24 15:45:44 2008
From: bgustavsson@REDACTED (Bjorn Gustavsson)
Date: Wed, 24 Sep 2008 15:45:44 +0200
Subject: [erlang-bugs] Dynamic libraries are not closed on MacOS X [with patch]
In-Reply-To:
References:
Message-ID: <6672d0160809240645y638d764ehc2ea80bc3ab1153a@mail.gmail.com>

2008/9/7 Paul Guyot
> Hello,
>
> On MacOS X, dynamic libraries opened with erl_ddll:load or load_driver are
> not closed when erl_ddll:unload/unload_driver is called.
>
> Steps to reproduce:
> * build a simple dynamic library, called simple_drv.so
> * start a new erlang shell in the same directory.
> * note down the PID.
> * evaluate erl_ddll:load(".", "simple_drv").
> * run lsof -p to see that indeed simple_drv.so file is open.
> * evaluate erl_ddll:unload("simple_drv").
> * run lsof -p to see that simple_drv.so file is still open.
>
> This is specific to MacOS X and this is simply because the code hasn't been
> written. On other Unix implementations, the erl_ddll code calls dlopen and
> dlclose.
>
> The attached patch against R12B-4 fixes the bug and was tested on MacOS X
> 10.4/ppc.
>

Thanks for your patch. It works fine, but we will use the simpler solution
of using dlopen() on all platforms:

http://www.erlang.org/pipermail/erlang-patches/2008-September/000293.html

/Bjorn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB

From hio@REDACTED Thu Sep 25 07:00:38 2008
From: hio@REDACTED (YAMASHINA Hio)
Date: Thu, 25 Sep 2008 14:00:38 +0900
Subject: [erlang-bugs] ssh module bugfixes
Message-ID: <20080925140038.022ab3d7.hio@hio.jp>

Hi.

Here is a bug-fix patch.
It contains fixes for miscellaneous bugs
(a typo and erroneous parameters).

--
YAMASHINA Hio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ssh-1.0.fix-bugs.patch
Type: application/octet-stream
Size: 1756 bytes
Desc: not available

From hio@REDACTED Thu Sep 25 07:00:42 2008
From: hio@REDACTED (YAMASHINA Hio)
Date: Thu, 25 Sep 2008 14:00:42 +0900
Subject: [erlang-bugs] ssh module v6 connect problem
Message-ID: <20080925140042.6b806c73.hio@hio.jp>

Hi.

I found a bug around connecting to v6 host.

    Host = "localhost",
    Port = 22,
    Ret2 = ssh:connect(Host, Port, [
        {user_dir, "xxx" },
        {user, "xxx" },
        {password, "xxx" },
        {silently_accept_hosts, true}
    ]),

On an IPv6-enabled box, the above invocation ends up as:

    Callback:connect(Address, Port, [{active, false} | SocketOpts],
                     Timeout)
    where Address = {0,0,0,0, 0,0,0,1},
          SocketOpts = [],
          Callback = gen_tcp

This fails with {error,nxdomain}, because SocketOpts does not
contain the inet6 option.

Should this problem be fixed in the gen_tcp module?

--
YAMASHINA Hio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ssh-1.0.fix-connectopt.patch
Type: application/octet-stream
Size: 1358 bytes
Desc: not available

From hio@REDACTED Thu Sep 25 07:00:46 2008
From: hio@REDACTED (YAMASHINA Hio)
Date: Thu, 25 Sep 2008 14:00:46 +0900
Subject: [erlang-bugs] ssh module, enhancement proposal for custom ssh_cli.
Message-ID: <20080925140046.7e36e6e6.hio@hio.jp>

Hi.

I wanted to write a customized ssh daemon application, but there
appears to be no way to trap ssh packets until the shell is started
(and no way to get the ssh_cm instance from the shell function).

I wrote a patch that allows writing a custom ssh daemon application
by replacing ssh_cli with another module, which can be selected
by the caller.

    % ssh_cli option:
    %  - module name which implements child_spec/4
    %  - or fun/4
    %  - default is 'ssh_cli'
    % these functions should return
    %  - fun/0 closure which returns child_pid.
    %  - or just a child_spec.

    ssh:daemon(22, [
        {ssh_cli, ?MODULE },
        {system_dir, "xxx"},
        {pwdfun, fun pwdfun/2}
    ]).

    child_spec(_Shell, _Address, _Port, _Options) ->
        ChildSpecFun = fun() ->
            ChildPid = spawn(fun() ->
                ?MODULE:init()
            end),
            ChildPid
        end,
        ChildSpecFun.

This is like the previous ssh_cm:listen:

    ssh_cm:listen(
        fun() ->
            spawn_link(?MODULE, init, [])
        end,
        22,
        [
            {system_dir, "."},
            {pwdfun, fun check_auth/2}
        ]
    ).

Thank you.

--
YAMASHINA Hio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ssh-1.0.custom-ssh_cli.patch
Type: application/octet-stream
Size: 912 bytes
Desc: not available

From raimo+erlang-bugs@REDACTED Thu Sep 25 11:39:38 2008
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Thu, 25 Sep 2008 11:39:38 +0200
Subject: [erlang-bugs] bug: Protocol: "inet_tcp": register/listen error: eaddrinuse while starting a node
In-Reply-To: <871vzyfn2m.fsf@yandex-team.ru>
References: <871vzyfn2m.fsf@yandex-team.ru>
Message-ID: <20080925093938.GA13424@erix.ericsson.se>

On Fri, Sep 05, 2008 at 10:46:09PM +0400, Igor Goryachev wrote:
> Hello everybody.
>
> It's me again. I've posted this message some time ago, but have not got
> any response. I confirm that this bug is reproduced (seems to be
> floating bug) when using R12B-3 version and '-kernel inet_dist_listen_min
> XXXX inet_dist_listen_max YYYY' when XXXX number equal to YYYY.

Hi Igor.

Yes why not. We see no harm in your patch and will apply it.
It will be part of the next service release.

Sorry about the slow response to your first mail.

>
> To: erlang-questions@REDACTED
> From: Igor Goryachev
> Date: Tue, 05 Feb 2008 18:00:30 +0300
> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)
> Cc: erlang-bugs@REDACTED
> Subject: [erlang-bugs] bug: Protocol: "inet_tcp": register/listen error:
> eaddrinuse while starting a node
> Xref: goryachev.yandex.ru lists.erlang-questions:11465 lists.erlang-bugs:456
>
> Hi, everyone!
>
> I have noticed some troubles while starting node with '-kernel
> inet_dist_listen_min 5290 inet_dist_listen_max 5290' options. And
> everything works fine when these options are omitted.
> > The output is here: > > =========== > {progress,preloaded} > {progress,kernel_load_completed} > {progress,modules_loaded} > {start,heart} > {start,error_logger} > {start,application_controller} > {progress,init_kernel_started} > {apply,{application,load,[{application,stdlib,[{description,"ERTS CXC 138 10"},{vsn,"1.14.5"},{id,[]},{modules,[base64,beam_lib,c,calendar,dets,dets_server,dets_sup,dets_utils,dets_v8,dets_v9,dict,digraph,digraph_utils,edlin,edlin_expand,epp,eval_bits,erl_bits,erl_compile,erl_eval,erl_expand_records,erl_internal,erl_lint,erl_parse,erl_posix_msg,erl_pp,erl_scan,erl_tar,error_logger_file_h,error_logger_tty_h,escript,ets,file_sorter,filelib,filename,gb_trees,gb_sets,gen,gen_event,gen_fsm,gen_server,io,io_lib,io_lib_format,io_lib_fread,io_lib_pretty,lib,lists,log_mf_h,math,ms_transform,orddict,ordsets,otp_internal,pg,pool,proc_lib,proplists,qlc,qlc_pt,queue,random,regexp,sets,shell,shell_default,slave,sofs,string,supervisor,supervisor_bridge,sys,timer,win32reg,zip]},{registered,[timer_server,rsh_starter,take_over_monitor,pool_master,dets]},{applications,[kernel]},{included_applications,[]},{env,[]},{start_phases,undefined},{maxT,infinity},{maxP,infinity}]}]}} > {progress,applications_loaded} > {apply,{application,start_boot,[kernel,permanent]}} > {error_logger,{{2008,2,5},{17,18,42}},"Protocol: ~p: register/listen error: ~p~n",["inet_tcp",eaddrinuse]} > {error_logger,{{2008,2,5},{17,18,42}},crash_report,[[{pid,<0.21.0>},{registered_name,net_kernel},{error_info,{error,badarg}},{initial_call,{gen,init_it,[gen_server,<0.18.0>,<0.18.0>,{local,net_kernel},net_kernel,{'ejabberd@REDACTED',shortnames,15000},[]]}},{ancestors,[net_sup,kernel_sup,<0.8.0>]},{messages,[]},{links,[<0.18.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,377},{stack_size,21},{reductions,331}],[]]} > 
{error_logger,{{2008,2,5},{17,18,42}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfa,{net_kernel,start_link,[['ejabberd@REDACTED',shortnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]} > {error_logger,{{2008,2,5},{17,18,42}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfa,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]} > {error_logger,{{2008,2,5},{17,18,42}},crash_report,[[{pid,<0.7.0>},{registered_name,[]},{error_info,{shutdown,{kernel,start,[normal,[]]}}},{initial_call,{application_master,init,[<0.5.0>,<0.6.0>,{appl_data,kernel,[application_controller,erl_reply,auth,boot_server,code_server,disk_log_server,disk_log_sup,erl_prim_loader,error_logger,file_server_2,fixtable_server,global_group,global_name_server,heart,init,kernel_config,kernel_sup,net_kernel,net_sup,rex,user,os_server,ddll_server,erl_epmd,inet_db,pg2],undefined,{kernel,[]},[application,application_controller,application_master,application_starter,auth,code,code_aux,packages,code_server,dist_util,erl_boot_server,erl_distribution,erl_prim_loader,erl_reply,erlang,error_handler,error_logger,file,file_server,file_io_server,prim_file,global,global_group,global_search,group,heart,hipe_unified_loader,inet6_tcp,inet6_tcp_dist,inet6_udp,inet_config,inet_hosts,inet_gethost_native,inet_tcp_dist,init,kernel,kernel_config,net,net_adm,net_kernel,os,ram_file,rpc,user,user_drv,user_sup,disk_log,disk_log_1,disk_log_server,disk_log_sup,dist_ac,erl_ddll,erl_epmd,erts_debug,gen_tcp,gen_udp,gen_sctp,prim_inet,inet,inet_db,inet_dns,inet_parse,inet_res,inet_tcp,inet_udp,inet_sctp,pg2,seq_trace,wrap_log_reader,zlib,otp_ring0],[],infinity,infinity},normal]}},{ancestors,[<0.6.0>]},{messages,[{'EXIT',<0.8.0>,normal}]},{links,[<0.6.0>,<0.5.0>]},{
dictionary,[]},{trap_exit,true},{status,running},{heap_size,987},{stack_size,21},{reductions,2063}],[]]}
> {apply,{application,start_boot,[stdlib,permanent]}}
> {error_logger,{{2008,2,5},{17,18,42}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
> {"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}
>
> Crash dump was written to: erl_crash.dump
> Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})
> ===========
>
>
> After a little bit of investigation I've noticed, that there is no
> '{reuseaddr, true}' socket option in the lib/kernel/src/inet_tcp_dist.erl. If
> I add this option, recompile inet_tcp_dist.erl, put it into proper
> place, everything works fine like when mentioned above kernel options
> are omitted.
>
> The tiny patch is included.
>
>
>
> --
> Igor Goryachev
> Yandex development team.
>
> --- inet_tcp_dist.erl.orig 2008-02-05 17:28:51.000000000 +0300
> +++ inet_tcp_dist.erl 2008-02-05 17:31:24.000000000 +0300
> @@ -62,7 +62,7 @@
>  %% ------------------------------------------------------------
>
>  listen(Name) ->
> -    case do_listen([{active, false}, {packet,2}]) of
> +    case do_listen([{active, false}, {packet,2}, {reuseaddr, true}]) of
>          {ok, Socket} ->
>              TcpAddress = get_tcp_address(Socket),
>              {_,Port} = TcpAddress#net_address.address,
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-bugs
>
>
> --
> Igor Goryachev
> Yandex development team.

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From raimo+erlang-bugs@REDACTED Thu Sep 25 15:27:12 2008
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Thu, 25 Sep 2008 15:27:12 +0200
Subject: [erlang-bugs] : bug: Protocol: "inet_tcp": register/listen error: eaddrinuse while starting a node
In-Reply-To: <87y71gmnka.fsf@yandex-team.ru>
References: <871vzyfn2m.fsf@yandex-team.ru> <20080925093938.GA13424@erix.ericsson.se> <87y71gmnka.fsf@yandex-team.ru>
Message-ID: <20080925132712.GA5481@erix.ericsson.se>

On Thu, Sep 25, 2008 at 04:12:37PM +0400, Igor Goryachev wrote:
> Raimo Niskanen writes:
>
> > On Fri, Sep 05, 2008 at 10:46:09PM +0400, Igor Goryachev wrote:
> >> Hello everybody.
> >>
> >> It's me again. I've posted this message some time ago, but have not got
> >> any response. I confirm that this bug is reproduced (seems to be
> >> floating bug) when using R12B-3 version and '-kernel inet_dist_listen_min
> >> XXXX inet_dist_listen_max YYYY' when XXXX number equal to YYYY.
> >
> > Hi Igor.
> >
> > Yes why not. We see no harm in your patch and will apply it.
> > It will be part of the next service release.
> >
> > Sorry about the slow response to your first mail.
>
> Hi, Raimo.
>
> Thank you very much for the response. I do not really understand why
> does it occur (well, I have no time for this investigation), but this
> tiny patch fixes that weird behaviour.
>

It *should* occur when an erlang node has a(n) incoming connection(s)
to the listen port, and then closes the port either by getting
killed by the OS or by itself. Then the TCP port ends up
in the infamous state TIME_WAIT aka 2MSL_TIMEOUT which
often is approximately 30 seconds. During that time
the port can not be bound to.

> Good luck.
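
As a rough illustration of the option discussed in this thread, the sketch below opens a listen socket with the same option list as the patched do_listen/1 call, using plain gen_tcp instead of the distribution code. The module name reuseaddr_demo and both function names are invented for this example; it is not the inet_tcp_dist implementation, only a way to try {reuseaddr, true} in isolation.

```erlang
-module(reuseaddr_demo).
-export([listen/1, check/0]).

%% Open a passive listen socket on Port with the address-reuse option set,
%% mirroring the option list from the patch quoted in this thread.
listen(Port) ->
    gen_tcp:listen(Port, [{active, false}, {packet, 2}, {reuseaddr, true}]).

%% Listen on an OS-assigned port (0), read the option back, then close.
check() ->
    {ok, LSock} = listen(0),
    Result = inet:getopts(LSock, [reuseaddr]),
    ok = gen_tcp:close(LSock),
    Result.
```

With the option set, a restarted node that asks for the same fixed port should be able to bind it even while old connections linger in TIME_WAIT; without it, the bind is expected to fail with eaddrinuse as in the report.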

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From ingela@REDACTED Thu Sep 25 17:01:45 2008
From: ingela@REDACTED (Ingela Anderton Andin)
Date: Thu, 25 Sep 2008 17:01:45 +0200
Subject: [erlang-bugs] ssh module bugfixes
In-Reply-To: <20080925140038.022ab3d7.hio@hio.jp>
References: <20080925140038.022ab3d7.hio@hio.jp>
Message-ID: <48DBA7D9.9020508@erix.ericsson.se>

Hi, thank you for the patch it has been included for the next release.

Regards Ingela Erlang/OTP - Ericsson

YAMASHINA Hio wrote:
> Hi.
>
> Here is a bug-fix patch.
> It contains fixes for miscellaneous bugs
> (a typo and erroneous parameters).

From ingela@REDACTED Thu Sep 25 17:52:25 2008
From: ingela@REDACTED (Ingela Anderton Andin)
Date: Thu, 25 Sep 2008 17:52:25 +0200
Subject: [erlang-bugs] ssh module, enhancement proposal for custom ssh_cli.
In-Reply-To: <20080925140046.7e36e6e6.hio@hio.jp>
References: <20080925140046.7e36e6e6.hio@hio.jp>
Message-ID: <48DBB3B9.1020703@erix.ericsson.se>

Hi,

We have accepted this patch. We of course want to retain some kind of
backwards compatibility with what was offered before. The ssh application
has been quite extensively remodeled and we are not really 100%
satisfied with it yet, but the circumstances called for a new release of
the application, and it still might have a few rough corners, so to speak.

Regards Ingela Erlang/OTP - Ericsson

YAMASHINA Hio wrote:
> Hi.
>
> I wanted to write a customized ssh daemon application, but there
> appears to be no way to trap ssh packets until the shell is started
> (and no way to get the ssh_cm instance from the shell function).
>
> I wrote a patch that allows writing a custom ssh daemon application
> by replacing ssh_cli with another module, which can be selected
> by the caller.
>
>     % ssh_cli option:
>     %  - module name which implements child_spec/4
>     %  - or fun/4
>     %  - default is 'ssh_cli'
>     % these functions should return
>     %  - fun/0 closure which returns child_pid.
>     %  - or just a child_spec.
>
>     ssh:daemon(22, [
>         {ssh_cli, ?MODULE },
>         {system_dir, "xxx"},
>         {pwdfun, fun pwdfun/2}
>     ]).
>
>     child_spec(_Shell, _Address, _Port, _Options) ->
>         ChildSpecFun = fun() ->
>             ChildPid = spawn(fun() ->
>                 ?MODULE:init()
>             end),
>             ChildPid
>         end,
>         ChildSpecFun.
>
>
> This is like the previous ssh_cm:listen:
>
>     ssh_cm:listen(
>         fun() ->
>             spawn_link(?MODULE, init, [])
>         end,
>         22,
>         [
>             {system_dir, "."},
>             {pwdfun, fun check_auth/2}
>         ]
>     ).
>
>
> Thank you.

From alexey.naidyonov@REDACTED Fri Sep 26 13:23:01 2008
From: alexey.naidyonov@REDACTED (Alexey Naidyonov)
Date: Fri, 26 Sep 2008 15:23:01 +0400
Subject: [erlang-bugs] R12B-4: http_chunk:encode(iolist()) produces invalid chunk
Message-ID:

Hello;

http_chunk:encode of the inets application is defined as

    encode(Chunk) when is_list(Chunk)->
        HEXSize = http_util:integer_to_hexlist(length(Chunk)),
        [HEXSize, ?CR, ?LF, Chunk, ?CR, ?LF].

When mod_esi:deliver is called with an iolist() containing binaries,
http_chunk:encode produces an invalid chunk, e.g.

    mod_esi:deliver(SessionID, [<<"abcd">>, <<"abcd">>]).

yields:

    2
    abcdabcd

I believe length/1 should be replaced with erlang:iolist_size/1, i.e.

    encode(Chunk) when is_list(Chunk)->
        HEXSize = http_util:integer_to_hexlist(erlang:iolist_size(Chunk)),
        [HEXSize, ?CR, ?LF, Chunk, ?CR, ?LF].

which produces the correct

    8
    abcdabcd

SY,
--
Alexey Naidyonov

From ingela@REDACTED Fri Sep 26 13:32:38 2008
From: ingela@REDACTED (Ingela Anderton Andin)
Date: Fri, 26 Sep 2008 13:32:38 +0200
Subject: [erlang-bugs] ssh module v6 connect problem
In-Reply-To: <20080925140042.6b806c73.hio@hio.jp>
References: <20080925140042.6b806c73.hio@hio.jp>
Message-ID: <48DCC856.2050500@erix.ericsson.se>

Hi,

Thanks for reporting this; it has been fixed, but in another way.

Regards Ingela Erlang/OTP - Ericsson

YAMASHINA Hio wrote:
> Hi.
>
> I found a bug around connecting to v6 host.
>
>     Host = "localhost",
>     Port = 22,
>     Ret2 = ssh:connect(Host, Port, [
>         {user_dir, "xxx" },
>         {user, "xxx" },
>         {password, "xxx" },
>         {silently_accept_hosts, true}
>     ]),
>
> On an IPv6-enabled box, the above invocation ends up as:
>
>     Callback:connect(Address, Port, [{active, false} | SocketOpts],
>                      Timeout)
>     where Address = {0,0,0,0, 0,0,0,1},
>           SocketOpts = [],
>           Callback = gen_tcp
>
> This fails with {error,nxdomain}, because SocketOpts does not
> contain the inet6 option.
>
> Should this problem be fixed in the gen_tcp module?
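
The thread does not say how the problem was eventually fixed, so the following is only a sketch of the idea raised in the report: derive the address-family option from the shape of the already-resolved address and pass it to gen_tcp:connect/4 explicitly. The module v6_connect_demo and the helper family/1 are hypothetical names made up for this example and are not part of ssh or gen_tcp.

```erlang
-module(v6_connect_demo).
-export([family/1, connect/3]).

%% Hypothetical helper: choose the family option from the tuple size.
%% A 4-tuple is an IPv4 address, an 8-tuple is an IPv6 address.
family({_, _, _, _}) -> inet;
family({_, _, _, _, _, _, _, _}) -> inet6.

%% Connect to an already-resolved address, adding the matching family
%% option so that an IPv6 tuple is not handed to the IPv4 code path.
connect(Address, Port, Timeout) when is_tuple(Address) ->
    Opts = [family(Address), {active, false}],
    gen_tcp:connect(Address, Port, Opts, Timeout).
```

For the address from the report, family({0,0,0,0,0,0,0,1}) gives inet6, so the option list passed to gen_tcp:connect/4 would start with inet6 instead of being left without a family.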