From kwidoyo@REDACTED Tue Mar 1 08:24:28 2011 From: kwidoyo@REDACTED (Kustarto Widoyo) Date: Tue, 01 Mar 2011 16:24:28 +0900 Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04) In-Reply-To: <4D4BA635.8050001@geminimobile.com> References: <4D4BA2F3.2080405@geminimobile.com> <4D4BA635.8050001@geminimobile.com> Message-ID: <4D6C9F2C.6020606@geminimobile.com> Could someone help to take a look at this issue? Regards, Widoyo From pan@REDACTED Tue Mar 1 11:54:28 2011 From: pan@REDACTED (pan@REDACTED) Date: Tue, 1 Mar 2011 11:54:28 +0100 Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04) In-Reply-To: <4D4BA2F3.2080405@geminimobile.com> References: <4D4BA2F3.2080405@geminimobile.com> Message-ID: Not much to go on here, could be anything. Could you supply the rest of the stack, i.e. who calls the print_term function? Also, by using the etp-commands gdb macros (in source tree, $ERL_TOP/erts/etc/unix/), you could see the term that's being printed. Is it corrupted? I would also try increasing the schedulers stacksize (erl +sss Value), as it could be a stack overrun (the parameters to the functions seem ok, but that's a wild guess). /Patrik On Fri, 4 Feb 2011, Kustarto Widoyo wrote: > Hi All, > > We found that our application was crashed and core dump file was generated, > but not found any erl_crash.dump file created. > > * The application uses erlang distribution protocol that more than 50 nodes > are involving in. All nodes run on Redhat 5.3. 
> > * We're using Erlang R13B04 64 bit built from source and the following > patches have been applied: > otp_src_R13B04-OTP-8475.patch > otp_src_R13B04-OTP-8612.patch > otp_src_R13B04-OTP-8643.patch > otp_src_R13B04-OTP-8658.patch > otp_src_R13B04-OTP-8661.patch > otp_src_R13B04-OTP-8662.patch > otp_src_R13B04-beam-break.patch > otp_src_R13B04-emacs.patch > otp_src_R13B04-erl_poll.patch > otp_src_R13B04-erts_de_busy_limit.patch > otp_src_R13B04-eunit.patch > otp_src_R13B04-httpc-memoryleak.patch > otp_src_R13B04-patch-etop.patch > otp_src_R13B04-patch-odbc-oracleworkaround.patch > otp_src_R13B04-supervisor.patch > > * The erl command line we use: > $ /usr/local/gemini/ert/R13B04/lib/erlang/erts-5.7.5/bin/beam.smp -A 64 -K > true -S 0 -- -root /usr/local/gemini/ert/R13B04/lib/erlang -progname erl -- > -home /export/home/mmssys -- -smp enable -noshell -noinput -noshell -sname > gdss1 -kernel net_ticktime 20 -boot > /usr/local/gemini/gdss/1.0.0/lib/app/gdss_all -config > /usr/local/gemini/gdss/1.0.0/../var/data/node1.config -pa > /usr/local/gemini/gdss/1.0.0/lib/app-patches -pz > /usr/local/gemini/gdss/1.0.0/lib/app -pz /usr/local/gemini/gdss/1.0.0/lib > -central_config /usr/local/gemini/gdss/1.0.0/etc/central.conf > -ticket_broker_config /usr/local/gemini/gdss/1.0.0/etc/broker.conf > > > * The following is gdb and backtrace output: > ------------------------ > [root@REDACTED data]# gdb > /usr/local/gemini/ert/R13B04/lib/erlang/erts-5.7.5/bin/beam.smp core.14720 > GNU gdb Fedora (6.8-27.el5) > Copyright (C) 2008 Free Software Foundation, Inc. > License GPLv3+: GNU GPL version 3 or later > This is free software: you are free to change and redistribute it. > There is NO WARRANTY, to the extent permitted by law. Type "show copying" > and "show warranty" for details. > This GDB was configured as "x86_64-redhat-linux-gnu"... > Reading symbols from /lib64/libutil.so.1...done. > Loaded symbols for /lib64/libutil.so.1 > Reading symbols from /lib64/libdl.so.2...done. 
> Loaded symbols for /lib64/libdl.so.2 > Reading symbols from /lib64/libm.so.6...done. > Loaded symbols for /lib64/libm.so.6 > Reading symbols from /usr/lib64/libncurses.so.5...done. > Loaded symbols for /usr/lib64/libncurses.so.5 > Reading symbols from /lib64/libpthread.so.0...done. > Loaded symbols for /lib64/libpthread.so.0 > Reading symbols from /lib64/librt.so.1...done. > Loaded symbols for /lib64/librt.so.1 > Reading symbols from /lib64/libc.so.6...done. > Loaded symbols for /lib64/libc.so.6 > Reading symbols from /lib64/ld-linux-x86-64.so.2...done. > Loaded symbols for /lib64/ld-linux-x86-64.so.2 > Reading symbols from > /usr/local/gemini/ert/R13B04/lib/erlang/lib/crypto-1.6.4/priv/lib/crypto_drv.so...done. > Loaded symbols for > /usr/local/gemini/ert/R13B04/lib/erlang/lib/crypto-1.6.4/priv/lib/crypto_drv.so > Reading symbols from > /usr/local/gemini/ert/R13B04/openssl/lib/libcrypto.so.0.9.8...done. > Loaded symbols for > /usr/local/gemini/ert/R13B04/openssl/lib/libcrypto.so.0.9.8 > Core was generated by > `/usr/local/gemini/ert/R13B04/lib/erlang/erts-5.7.5/bin/beam.smp -A 64 -K > true -'. > Program terminated with signal 11, Segmentation fault. > [New process 14798] > > snip ... snip ... snip ... > > #0 0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 , > arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840 > 840 common/erl_printf_format.c: No such file or directory. > in common/erl_printf_format.c > (gdb) bt > #0 0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 , > arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840 > #1 0x000000000048c6d8 in print_term (fn=0x5880f0 , arg=0xb750930, > obj=5800176, dcount=0x458f2bc8) > at beam/erl_printf_term.c:346 > #2 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, > obj=, dcount=0x458f2bc8) > at beam/erl_printf_term.c:349 > #3 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, > obj=, dcount=0x458f2bc8) > at beam/erl_printf_term.c:349 > > snip ... snip ... snip ... 
> ------------------------ > > * And in syslog, we found: > Jan 27 23:48:21 gds001c kernel: beam.smp[14798]: segfault at 0000000044ef3ff8 > rip 0000000000585ea8 rsp 0000000044ef4000 error 6 > > Please let me know, if there is anything else we have to provide. > > Regards, > -- > Kustarto Widoyo > Gemini Mobile Technologies > > ________________________________________________________________ > erlang-bugs (at) erlang.org mailing list. > See http://www.erlang.org/faq.html > To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED > From dave@REDACTED Fri Mar 4 11:53:58 2011 From: dave@REDACTED (Dave Cottlehuber) Date: Fri, 4 Mar 2011 23:53:58 +1300 Subject: erl.exe dies but werl.exe does not on both Windows XP and 2008R2 with R14B01 Message-ID: Hallo, There are 2 issues I've identified - VM crash & VM hang. Both occur within a CouchDB build of erlang, on various windows variants. This email covers the crash only. It is easy to reproduce: Install CouchDB 1.0.1 or a more recent build from https://github.com/dch/couchdb/downloads and curl.exe from http://haxx.se/ open command prompt and change to couchdb/bin folder. set erl=erl couchdb.bat & run this script until erlang hangs (watch the erl console scroll by!). On my 2 testbeds this takes less than a minute to occur - just 25 curls. ::restart_couch.cmd @echo off :restart for /l %%i in (1,1,100000000000) do @call :curl %%i goto :eof :curl curl -v -H "Content-Type: application/json" -X POST http://localhost:5984/_restart :: check to see if couch died horribly if exist erl_crash.dump echo Woops!!!! 
&& move /y erl_crash.dump erl_crash.dump.%1 goto :eof =erl_crash_dump:0.1 Fri Mar 04 23:48:06 2011 Slogan: Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}) System version: Erlang R14B01 (erts-5.8.2) [source] [smp:2:2] [rq:2] [async-threads:4] Compiled: Sat Feb 12 23:25:21 2011 The full .dump can be found at http://friendpaste.com/1gMbN0i2zn3mlaHAC54D58 1> [replicator] max_http_sessions="20" 1> [replicator] ssl_certificate_max_depth="3" 1> [replicator] verify_ssl_certificates="false" 1> [stats] rate="1000" 1> [stats] samples="[0, 60, 300, 900]" 1> [uuids] algorithm="sequential" 1> Apache CouchDB has started. Time to relax. 1> [info] [<0.689.0>] Apache CouchDB has started on http://0.0.0.0:5984/ 1> [debug] [<0.761.0>] 'POST' /_restart {1,1} Headers: [{'Accept',"*/*"}, {'Content-Type',"application/json"}, {'Host',"localhost:5984"}, {'User-Agent',"curl/7.19.0 (i586-pc-mingw32msvc) libcurl/7.19.0 OpenSSL/1.0.0c zlib/1.2.3" }] 1> [debug] [<0.761.0>] OAuth Params: [] 1> [info] [<0.761.0>] 127.0.0.1 - - 'POST' /_restart 200 1> {error_logger,{{2011,3,4},{23,48,5}},crash_report,[[{initial_call,{supervisor_bridge,user_sup,['Argument__1']}},{pid,<0.785.0>},{registered_name,[]},{error_info,{exit,nouser,[{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[kernel_sup,<0.773.0>]},{messages,[]},{links,[<0.774.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,24},{reductions,338}],[]]} {error_logger,{{2011,3,4},{23,48,5}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,nouser},{offender,[{pid,undefined},{name,user},{mfargs,{user_sup,start,[]}},{restart_type,temporary},{shutdown,2000},{child_type,supervisor}]}]} {error_logger,{{2011,3,4},{23,48,5}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]} {"Kernel pid 
terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"} Crash dump was written to: erl_crash.dump Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}) Can anybody help clarify why this happens, and what we can do about it? Thanks Dave From dave@REDACTED Fri Mar 4 11:57:27 2011 From: dave@REDACTED (Dave Cottlehuber) Date: Fri, 4 Mar 2011 23:57:27 +1300 Subject: erl.exe dies but werl.exe does not on both Windows XP and 2008R2 with R14B01 In-Reply-To: References: Message-ID: On 4 March 2011 23:53, Dave Cottlehuber wrote: > Hallo, > > There are 2 issues I've identified - VM crash & VM hang. Both occur > within a CouchDB build of erlang, on various windows variants. This > email covers the crash only. It is easy to reproduce: > > Install CouchDB 1.0.1 or a more recent build from > https://github.com/dch/couchdb/downloads and curl.exe from > http://haxx.se/ > open command prompt and change to couchdb/bin folder. > set erl=erl > couchdb.bat > > & run this script until erlang dies (watch the erl console scroll > by!). On my 2 testbeds this takes less than a minute to occur - just > 25 curls. Sorry; key point is that running with werl.exe instead will run successfully for days. More info on the original issue is available at https://issues.apache.org/jira/browse/COUCHDB-963. Thanks again. Dave From dave@REDACTED Fri Mar 4 12:11:35 2011 From: dave@REDACTED (Dave Cottlehuber) Date: Sat, 5 Mar 2011 00:11:35 +1300 Subject: erl.exe dies but werl.exe does not on both Windows XP and 2008R2 with R14B01 In-Reply-To: References: Message-ID: On 4 March 2011 23:53, Dave Cottlehuber wrote: > Hallo, > > There are 2 issues I've identified - VM crash & VM hang. Both occur > within a CouchDB build of erlang, on various windows variants. This > email covers the crash only. 
It is easy to reproduce: The 2nd issue is related to VM hang, under erlsrv service account & not batch file. I am using following erlsrv configurations; the debug one provides a visible console so easier to work with but same issue occurs with either service configuration. debug: erlsrv.exe add "CouchDeBug" -workdir "c:\couch\couchdb-1.0.2\bin" -onfail restart_always -debugtype console -args "-sasl errlog_type error -s couch +A 4 +W w" -comment "CouchDeBug" -machine "c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe" new: erlsrv.exe add "NewCouch" -workdir "c:\couch\couchdb-1.0.2\bin" -onfail restart_always -args "-sasl errlog_type error -s couch +A 4 +W w" -comment "NewCouch" -machine "c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe" erlsrv starts erl.exe not werl.exe and so the issue noted in previous email crops up and the application is down. 2 questions - Why does erlsrv not use werl.exe ? it is technically possible to pass erlsrv the -machine parameter with werl.exe, and it runs successfully as a service avoiding the original issue. The erlang/OTP source confirms that this is bad - but why? 2nd part. using erlsrv "-restart_always" couchdb restarts very quickly. but after up to 8h of continuous curl _restart, the erl.exe window definitely hangs - no input accepted - as if the REPL loop is over. Any ideas why erlang seems to hang around init:restart() and what we can do about it? Do you want any more information? Thanks Dave From magnus.henoch@REDACTED Fri Mar 4 15:39:15 2011 From: magnus.henoch@REDACTED (Magnus Henoch) Date: Fri, 4 Mar 2011 14:39:15 +0000 (GMT) Subject: Can't run mnesia:first on empty fragmented table In-Reply-To: <2056185093.24901299249121046.JavaMail.root@zimbra> Message-ID: <642444943.25071299249555440.JavaMail.root@zimbra> Hi all, When I run mnesia:first on an empty fragmented table, it tries to access the fragment with the number one beyond the maximum. 
In the sample code below, I create a table with two fragments, 'foo' and 'foo_frag2', but mnesia tries to access 'foo_frag3': -module(foo). -compile(export_all). foo() -> net_kernel:start([foo, shortnames]), application:start(mnesia), {atomic, ok} = mnesia:create_table(foo, []), %% activate fragmentation {atomic, ok} = mnesia:change_table_frag(foo, {activate, []}), %% add a second fragment on this node {atomic, ok} = mnesia:change_table_frag(foo, {add_frag, [node()]}), io:format("Our table is fragmented:~n~p~n", [mnesia:table_info(foo, all)]), io:format("Now let's run mnesia:first. We expect to get ~p.~n~p~n", ['$end_of_table', %% but we get {'EXIT',{aborted,{no_exists,[foo_frag3]}}} catch mnesia:activity(sync_dirty, fun() -> mnesia:first(foo) end, [], mnesia_frag)]). It looks like a simple off-by-one error in mnesia_frag:search_first. Changing the guard from '=<' to '<' as in the patch below fixes my test case (and the real system I distilled it from), but I'd appreciate a second opinion. Regards, Magnus diff --git a/lib/mnesia/src/mnesia_frag.erl b/lib/mnesia/src/mnesia_frag.erl index a2958ab..d33dafe 100644 --- a/lib/mnesia/src/mnesia_frag.erl +++ b/lib/mnesia/src/mnesia_frag.erl @@ -209,7 +209,7 @@ first(ActivityId, Opaque, Tab) -> end end. -search_first(ActivityId, Opaque, Tab, N, FH) when N =< FH#frag_state.n_fragments -> +search_first(ActivityId, Opaque, Tab, N, FH) when N < FH#frag_state.n_fragments -> NextN = N + 1, NextFrag = n_to_frag_name(Tab, NextN), case mnesia:first(ActivityId, Opaque, NextFrag) of From pan@REDACTED Fri Mar 4 16:41:03 2011 From: pan@REDACTED (pan@REDACTED) Date: Fri, 4 Mar 2011 16:41:03 +0100 Subject: [erlang-bugs] Re: erl.exe dies but werl.exe does not on both Windows XP and 2008R2 with R14B01 In-Reply-To: References: Message-ID: Hi, On Sat, 5 Mar 2011, Dave Cottlehuber wrote: > On 4 March 2011 23:53, Dave Cottlehuber wrote: >> Hallo, >> >> There are 2 issues I've identified - VM crash & VM hang. 
Both occur >> within a CouchDB build of erlang, on various windows variants. This >> email covers the crash only. It is easy to reproduce: > > The 2nd issue is related to VM hang, under erlsrv service account & > not batch file. I am using following erlsrv configurations; the debug > one provides a visible console so easier to work with but same issue > occurs with either service configuration. > > debug: erlsrv.exe add "CouchDeBug" -workdir > "c:\couch\couchdb-1.0.2\bin" -onfail restart_always -debugtype console > -args "-sasl errlog_type error -s couch +A 4 +W w" -comment > "CouchDeBug" -machine "c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe" > > new: erlsrv.exe add "NewCouch" -workdir "c:\couch\couchdb-1.0.2\bin" > -onfail restart_always -args "-sasl errlog_type error -s couch +A 4 +W > w" -comment "NewCouch" -machine > "c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe" > > erlsrv starts erl.exe not werl.exe and so the issue noted in previous > email crops up and the application is down. I think it may be the same issue. We're investigating the batch file issue to start with. The problem is easy to reproduce - very nice. > > 2 questions - > > Why does erlsrv not use werl.exe ? it is technically possible to pass > erlsrv the -machine parameter with werl.exe, and it runs successfully > as a service avoiding the original issue. The erlang/OTP source > confirms that this is bad - but why? Werl cannot handle everything the erlsrv program wants to do to the machine, like stopactions, killing by signalling etc. Fixing erl so it does not hang is the easiest and best thing to do here. > > 2nd part. using erlsrv "-restart_always" couchdb restarts very > quickly. but after up to 8h of continuous curl _restart, the erl.exe > window definitely hangs - no input accepted - as if the REPL loop is > over. > > Any ideas why erlang seems to hang around init:restart() and what we > can do about it? Do you want any more information? 
When running erl on Windows you get the "old shell", meaning that
another io-server is running and also a special driver (the fd-driver)
is used. The fd-driver is emulating Unix behaviour and might be the
cause of all the problems, but the actual user.erl code might also be
broken. I'll debug it, find the first problem and get back to you when
I've narrowed it down!

> Thanks
> Dave

Cheers,
/Patrik

>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>

From dangud@REDACTED Mon Mar 7 16:07:49 2011
From: dangud@REDACTED (Dan Gudmundsson)
Date: Mon, 7 Mar 2011 16:07:49 +0100
Subject: [erlang-bugs] Can't run mnesia:first on empty fragmented table
In-Reply-To: <642444943.25071299249555440.JavaMail.root@zimbra>
References: <2056185093.24901299249121046.JavaMail.root@zimbra> <642444943.25071299249555440.JavaMail.root@zimbra>
Message-ID:

Looks correct to me. I will include it directly.

/Dan

On Fri, Mar 4, 2011 at 3:39 PM, Magnus Henoch wrote:
> Hi all,
>
> When I run mnesia:first on an empty fragmented table, it tries to
> access the fragment with the number one beyond the maximum. In the
> sample code below, I create a table with two fragments, 'foo' and
> 'foo_frag2', but mnesia tries to access 'foo_frag3':
>
> -module(foo).
>
> -compile(export_all).
>
> foo() ->
>     net_kernel:start([foo, shortnames]),
>     application:start(mnesia),
>     {atomic, ok} = mnesia:create_table(foo, []),
>     %% activate fragmentation
>     {atomic, ok} = mnesia:change_table_frag(foo, {activate, []}),
>     %% add a second fragment on this node
>     {atomic, ok} = mnesia:change_table_frag(foo, {add_frag, [node()]}),
>
>     io:format("Our table is fragmented:~n~p~n", [mnesia:table_info(foo, all)]),
>
>     io:format("Now let's run mnesia:first.  We expect to get ~p.~n~p~n",
>               ['$end_of_table',
>                %% but we get {'EXIT',{aborted,{no_exists,[foo_frag3]}}}
>                catch mnesia:activity(sync_dirty,
>                                      fun() -> mnesia:first(foo) end,
>                                      [],
>                                      mnesia_frag)]).
>
> It looks like a simple off-by-one error in mnesia_frag:search_first.
> Changing the guard from '=<' to '<' as in the patch below fixes my
> test case (and the real system I distilled it from), but I'd
> appreciate a second opinion.
>
> Regards,
> Magnus
>
>
> diff --git a/lib/mnesia/src/mnesia_frag.erl b/lib/mnesia/src/mnesia_frag.erl
> index a2958ab..d33dafe 100644
> --- a/lib/mnesia/src/mnesia_frag.erl
> +++ b/lib/mnesia/src/mnesia_frag.erl
> @@ -209,7 +209,7 @@ first(ActivityId, Opaque, Tab) ->
>             end
>      end.
>
> -search_first(ActivityId, Opaque, Tab, N, FH) when N =< FH#frag_state.n_fragments ->
> +search_first(ActivityId, Opaque, Tab, N, FH) when N < FH#frag_state.n_fragments ->
>      NextN = N + 1,
>      NextFrag = n_to_frag_name(Tab, NextN),
>      case mnesia:first(ActivityId, Opaque, NextFrag) of
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
>

From kwidoyo@REDACTED Tue Mar 8 09:13:56 2011
From: kwidoyo@REDACTED (Kustarto Widoyo)
Date: Tue, 08 Mar 2011 17:13:56 +0900
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To:
References: <4D4BA2F3.2080405@geminimobile.com>
Message-ID: <4D75E544.6080209@geminimobile.com>

> Could you supply the rest of the stack, i.e. who calls the print_term
> function?

The following is the rest of the stack (result of thread apply all bt).

...
Thread 1 (process 14798): #0 0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 , arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840 #1 0x000000000048c6d8 in print_term (fn=0x5880f0 , arg=0xb750930, obj=5800176, dcount=0x458f2bc8) at beam/erl_printf_term.c:346 #2 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, obj=, dcount=0x458f2bc8) at beam/erl_printf_term.c:349 #3 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, obj=, dcount=0x458f2bc8) at beam/erl_printf_term.c:349 #4 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, obj=, dcount=0x458f2bc8) at beam/erl_printf_term.c:349 ....snip.... #43669 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, obj=, dcount=0x458f2bc8) at beam/erl_printf_term.c:349 #43670 0x000000000048cf48 in erts_printf_term (fn=0xb750930, arg=0x44ef4004, term=1, precision=56331) at beam/erl_printf_term.c:452 #43671 0x0000000000586b9c in erts_printf_format (fn=0x5880f0 , arg=0xb750930, fmt=, ap=0x458f2cd0) at common/erl_printf_format.c:813 #43672 0x0000000000587df4 in erts_vdsprintf (dsbufp=0xb750930, format=0x591897 " %T\n", arglist=0x458f2cd0) at common/erl_printf.c:419 #43673 0x0000000000470fe0 in erts_print (to=, arg=0x1, format=0x1c2510
) at beam/utils.c:299 #43674 0x00000000004921a3 in erts_program_counter_info (to=-4, to_arg=0xb750930, p=0x2aaabb0ae630) at beam/erl_process.c:8387 #43675 0x00000000004921f5 in erts_stack_dump (to=-4, to_arg=0xb750930, p=0x1) at beam/erl_process.c:8360 #43676 0x00000000004e9191 in print_process_info (to=-4, to_arg=0xb750930, p=0x2aaabb0ae630) at beam/break.c:343 #43677 0x00000000004e9384 in process_info (to=-4, to_arg=0xb750930) at beam/break.c:79 #43678 0x0000000000458744 in system_info_1 (A__p=0x2aaabb0b6ca0, A_1=24459) at beam/erl_bif_info.c:1973 #43679 0x000000000051279a in process_main () at beam/beam_emu.c:2087 #43680 0x000000000049fac2 in sched_thread_func (vesdp=) at beam/erl_process.c:3060 #43681 0x0000000000585314 in thr_wrapper (vtwd=) at common/ethread.c:480 #43682 0x0000003383006367 in start_thread () from /lib64/libpthread.so.0 #43683 0x00000033824d309d in clone () from /lib64/libc.so.6 > Also, by using the etp-commands gdb macros (in source tree, > $ERL_TOP/erts/etc/unix/), you could see the term that's being printed. > Is it corrupted? Sorry, I am still not able to use it. Thanks, Widoyo From pan@REDACTED Tue Mar 8 14:01:32 2011 From: pan@REDACTED (pan@REDACTED) Date: Tue, 8 Mar 2011 14:01:32 +0100 Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04) In-Reply-To: <4D75E41C.1080502@geminimobile.com> References: <4D4BA2F3.2080405@geminimobile.com> <4D75E41C.1080502@geminimobile.com> Message-ID: Hi! Interesting, seems that the system_info bif barfs... It would be interesting to see a printout of the parameters to system_info_1, the first parameter should be a pointer to a process structure and the second an erlang term (you print it with etp A__1). Do you have any possibility to provide me with the core and your build of the VM, together with the source? Cheers, /Patrik On Tue, 8 Mar 2011, Kustarto Widoyo wrote: > >> Could you supply the rest of the stack, i.e. who calls the print_term >> function? > > Please find the attached file. 
It's about the rest of the stack. > > Thread 1 (process 14798): > #0 0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 , > arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840 > #1 0x000000000048c6d8 in print_term (fn=0x5880f0 , arg=0xb750930, > obj=5800176, dcount=0x458f2bc8) at beam/erl_printf_term.c:346 > #2 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, > obj=, dcount=0x458f2bc8) at > beam/erl_printf_term.c:349 > #3 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, > obj=, dcount=0x458f2bc8) at > beam/erl_printf_term.c:349 > #4 0x000000000048c740 in print_term (fn=0x5880f0 , arg=0xb750930, > obj=, dcount=0x458f2bc8) at > beam/erl_printf_term.c:349 > > ....snip.... > > #43669 0x000000000048c740 in print_term (fn=0x5880f0 , > arg=0xb750930, obj=, dcount=0x458f2bc8) > at beam/erl_printf_term.c:349 > #43670 0x000000000048cf48 in erts_printf_term (fn=0xb750930, arg=0x44ef4004, > term=1, precision=56331) at beam/erl_printf_term.c:452 > #43671 0x0000000000586b9c in erts_printf_format (fn=0x5880f0 , > arg=0xb750930, fmt=, ap=0x458f2cd0) > at common/erl_printf_format.c:813 > #43672 0x0000000000587df4 in erts_vdsprintf (dsbufp=0xb750930, > format=0x591897 " %T\n", arglist=0x458f2cd0) at common/erl_printf.c:419 > #43673 0x0000000000470fe0 in erts_print (to=, arg=0x1, > format=0x1c2510
) at beam/utils.c:299 > #43674 0x00000000004921a3 in erts_program_counter_info (to=-4, > to_arg=0xb750930, p=0x2aaabb0ae630) at beam/erl_process.c:8387 > #43675 0x00000000004921f5 in erts_stack_dump (to=-4, to_arg=0xb750930, p=0x1) > at beam/erl_process.c:8360 > #43676 0x00000000004e9191 in print_process_info (to=-4, to_arg=0xb750930, > p=0x2aaabb0ae630) at beam/break.c:343 > #43677 0x00000000004e9384 in process_info (to=-4, to_arg=0xb750930) > at beam/break.c:79 > #43678 0x0000000000458744 in system_info_1 (A__p=0x2aaabb0b6ca0, A_1=24459) > at beam/erl_bif_info.c:1973 > #43679 0x000000000051279a in process_main () at beam/beam_emu.c:2087 > #43680 0x000000000049fac2 in sched_thread_func (vesdp=) > at beam/erl_process.c:3060 > #43681 0x0000000000585314 in thr_wrapper (vtwd=) > at common/ethread.c:480 > #43682 0x0000003383006367 in start_thread () from /lib64/libpthread.so.0 > #43683 0x00000033824d309d in clone () from /lib64/libc.so.6 > >> Also, by using the etp-commands gdb macros (in source tree, >> $ERL_TOP/erts/etc/unix/), you could see the term that's being printed. >> Is it corrupted? > > Sorry, I am still not able to use it. > > Thanks, > Widoyo > > > From gopienko@REDACTED Thu Mar 10 08:39:49 2011 From: gopienko@REDACTED (Andrew Gopienko) Date: Thu, 10 Mar 2011 13:39:49 +0600 Subject: Relup instructions order Message-ID: Erlang R14B01, Ubuntu 10.10 x86 box gproc.appup file ---------------------------------------------------- %% appup generated for gproc by rebar ("2011/03/10 13:15:09") {"0.1.1", [{"0.01", [{delete_module,gproc_eqc},{update,gproc,{advanced,[]}}]}], [{"0.01", []}] }. relup file ------------------------------------------------------- {"0.5.5", [{"0.5.4",[], [{load_object_code,{gproc,"0.1.1",[gproc]}}, point_of_no_return, {remove,{gproc_eqc,brutal_purge,brutal_purge}}, {purge,[gproc_eqc]}, {suspend,[gproc]}, {load,{gproc,brutal_purge,brutal_purge}}, {code_change,up,[{gproc,[]}]}, {resume,[gproc]}]}], [{"0.5.4",[],[point_of_no_return]}]}. 
After the 'point_of_no_return' instruction is evaluated, the library
path is updated to the new location, and the 'remove' instruction
crashes in the call
release_handler_1:get_vsn(non_existing=code:which(gproc_eqc)).

release_handler:upgrade_app also crashes with the same reason.

(flm@REDACTED)1> code:which(gproc_eqc).
"/home/tdx/devel/fleetm/rel/flm_0.5.4/lib/gproc-0.01/ebin/gproc_eqc.beam"
(flm@REDACTED)2> release_handler:upgrade_app(gproc, "../flm/lib/gproc-0.1.1").
{'EXIT',{'EXIT',{{badmatch,{error,beam_lib,
                            {file_error,"non_existing.beam",enoent}}},
                 [{release_handler_1,get_vsn,1},
                  {release_handler_1,add_old_vsn,2},
                  {release_handler_1,eval,2},
                  {lists,foldl,3},
                  {release_handler_1,eval_script,4},
                  {release_handler,eval_appup_script,4},
                  {erl_eval,do_apply,5},
                  {shell,exprs,7}]}}}
(flm@REDACTED)3> code:which(gproc_eqc).
non_existing

After reordering the instructions in the relup so that 'remove' comes
before 'point_of_no_return', the release upgraded successfully.

From bernie@REDACTED Fri Mar 11 00:16:39 2011
From: bernie@REDACTED (Bernard Duggan)
Date: Fri, 11 Mar 2011 10:16:39 +1100
Subject: Variable incorrectly unbound when bound and used in binary match
Message-ID: <4D795BD7.4010708@m5net.com>

Reposting this (with slight modifications); it was originally posted to
the questions list. It seemed presumptuous to go straight to the bugs
list, but the consensus seems to be that that's what this is:

So I've just run into an interesting little bit of behaviour that
doesn't seem quite right. In the following code:
-----------------------------------------
-module(casetest).

-export([test/0]).

test() ->
    match(<<1, 2, 3, 4, 5, 6, 7, 8>>).

match(<>) ->
    case A of
        B -> wrong;
        _ -> ok
    end.
-----------------------------------------
erlc gives me the warning

./casetest.erl:11: Warning: this clause cannot match because a previous
clause at line 10 always matches

(line 10 is the "B -> wrong;" line). And sure enough, if you run test/0
you get 'wrong' back.
That, in itself, is curious to me since by my understanding B should be bound by the function header, and have no guarantee of being the same as A. I can't see how it could be unbound. Doubly curious, is that if I stop using B as the size specifier of C, like this: match(<>) -> The warning goes away. And the result becomes 'ok' (in spite of nothing in the body having changed, and the only thing changing in the header being the size of an unused variable at the tail of the binary). Similarly, if I change the body of match/1 to this: Z = B, case A of Z -> wrong; _ -> ok end. It also works. So, yeah, it kinda looks like a bug. Cheers, Bernard From g@REDACTED Fri Mar 11 23:30:00 2011 From: g@REDACTED (Garrett Smith) Date: Fri, 11 Mar 2011 16:30:00 -0600 Subject: Missing ssl_certificate_key_file in docs Message-ID: In http://www.erlang.org/doc/man/httpd.html there's no mention of ssl_certificate_key_file, which is required for SSL support. Unless there's another way to configure the key, that should also be a required option when ssl is specified (as is the case for ssl_certificate_key_file). Garrett From g@REDACTED Fri Mar 11 23:39:11 2011 From: g@REDACTED (Garrett Smith) Date: Fri, 11 Mar 2011 16:39:11 -0600 Subject: Missing ssl_certificate_key_file in docs In-Reply-To: References: Message-ID: On Fri, Mar 11, 2011 at 4:30 PM, Garrett Smith wrote: > In http://www.erlang.org/doc/man/httpd.html there's no mention of > ssl_certificate_key_file, which is required for SSL support. > > Unless there's another way to configure the key, that should also be a > required option when ssl is specified (as is the case for > ssl_certificate_key_file). Er, as is the case for ssl_certificate_file. 
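
For readers following this thread, a minimal sketch of an httpd service
configuration with SSL enabled may help. It assumes the R14-era inets
property names (socket_type, ssl_certificate_file and the
ssl_certificate_key_file option discussed above); the port, server name
and every path are hypothetical placeholders, not values from the
report, and the fragment has not been tested against a running server.

```erlang
%% Hypothetical inets httpd service configuration (a sketch only).
%% The port, server name and all paths below are placeholders.
[{port, 8443},
 {server_name, "localhost"},
 {server_root, "/tmp/httpd"},
 {document_root, "/tmp/httpd/htdocs"},
 %% Assumption: R14-era value for enabling SSL on the listening socket.
 {socket_type, ssl},
 %% Certificate presented by the server.
 {ssl_certificate_file, "/tmp/httpd/ssl/server.crt"},
 %% Private key for that certificate; per the report above, this is the
 %% option the httpd documentation does not mention.
 {ssl_certificate_key_file, "/tmp/httpd/ssl/server.key"}].
```

A list like this would typically be passed as the service configuration
to inets:start(httpd, Config), with the ssl application already
started.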
From g@REDACTED Sat Mar 12 04:01:45 2011 From: g@REDACTED (Garrett Smith) Date: Fri, 11 Mar 2011 21:01:45 -0600 Subject: security_directory docs incorrect Message-ID: In http://www.erlang.org/doc/man/httpd.html, the docs for security directory properties look like this: {security_data_file, path()}... {security_max_retries, integer()}... {security_block_time, integer()}... {security_fail_expire_time, integer()}... {security_auth_timeout, integer()}... The values actually used are: {data_file, path()}... {max_retries, integer()}... {block_time, integer()}... {fail_expire_time, integer()} {auth_timeout, integer()}... See to mod_security.erl and mod_security_server.erl. Garrett From g@REDACTED Sat Mar 12 04:24:11 2011 From: g@REDACTED (Garrett Smith) Date: Fri, 11 Mar 2011 21:24:11 -0600 Subject: Undocumented path property required in security_directory Message-ID: mod_security relies on a 'path' property in a security_directory. If this property isn't available, you can't unblock blocked users. Refer to line 453 of mod_security_server.erl. E.g. a config like this: {security_directory, {"/", [{data_file, "security.dets"}]}}, will list undefined for the directory in the list of blocked users: [{"me",any,8080,undefined,{{2011,3,11},{22,8,3}}}] Config like this: {security_directory, {"/", [{path, "/"}, {data_file, "security.dets"}]}}, however, will work fine. I suspect that 'path' was supposed to be added implicitly to the DataDir proplist during the 'store' operation in mod_security, rather than require the user to explicitly configure it as in my example. Garrett From g@REDACTED Sat Mar 12 19:00:52 2011 From: g@REDACTED (Garrett Smith) Date: Sat, 12 Mar 2011 12:00:52 -0600 Subject: httpd.hrl not in canonical location Message-ID: The inets application doesn't have the canonical 'include' directory. Public include files like httpd.hrl are located under 'src'. This breaks typical usage: -include_lib("inets/include/httpd.hrl"). 
See docs in http://www.erlang.org/doc/man/httpd.html.

Garrett

From bgustavsson@REDACTED Mon Mar 14 17:03:19 2011
From: bgustavsson@REDACTED (Björn Gustavsson)
Date: Mon, 14 Mar 2011 17:03:19 +0100
Subject: [erlang-bugs] Variable incorrectly unbound when bound and used in binary match
In-Reply-To: <4D795BD7.4010708@m5net.com>
References: <4D795BD7.4010708@m5net.com>
Message-ID:

On Fri, Mar 11, 2011 at 12:16 AM, Bernard Duggan wrote:
> Reposting this (with slight modifications) that was posed in the questions
> list. It seemed presumptuous to go straight to the bugs list, but the
> consensus seems to be that that's what this is:
>
> So I've just run into an interesting little bit of behaviour that
> doesn't seem quite right. In the following code:
> -----------------------------------------
> -module(casetest).
>
> -export([test/0]).
>
> test() ->
>     match(<<1, 2, 3, 4, 5, 6, 7, 8>>).
>
> match(<>) ->
>     case A of
>         B -> wrong;
>         _ -> ok
>     end.

[...]

Thanks for reporting this bug. It is indeed a bug in the handling
of variables in binary matching.

It is too late to fix the bug in R14B02, so we will fix it in R14B03.
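As a generic illustration of the construct under discussion (not the code
from the report), here is a minimal sketch of a variable that is bound by
one segment of a binary pattern and then used as the size of a later
segment in the same function head. The module and function names are made
up for the example.

```erlang
-module(binsize_sketch).
-export([split/1]).

%% Len is bound by the first (8-bit integer) segment and is then used
%% as the byte size of the Body segment within the same pattern.
split(<<Len:8, Body:Len/binary, Rest/binary>>) ->
    {Len, Body, Rest}.
```

Calling binsize_sketch:split(<<2, 3, 4, 5>>) returns {2, <<3,4>>, <<5>>}:
by the time the function body runs, Len is an ordinary bound variable.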
--
Björn Gustavsson, Erlang/OTP, Ericsson AB

From igor@REDACTED Mon Mar 14 21:41:41 2011
From: igor@REDACTED (Igor Goryachev)
Date: Mon, 14 Mar 2011 23:41:41 +0300
Subject: segmentation fault in tree_delete at beam/erl_bestfit_alloc.c:431
Message-ID: <871v29a0ze.fsf@goryachev.org>

Hello.

We are suffering from quite frequent segmentation faults in our erlangish
environment. We run an r14b01 node with a very small load on linux 2.6.32
(Debian GNU/Linux Squeeze 6.0), which is a virtual machine hosted under
the OpenVZ hypervisor (16 cores, Xeon 2.40GHz).

I've tried to rebuild erlang with and without smp and threads, but in
any case I'm getting the same behaviour.

What additional helpful information should I provide?

Core was generated by `/usr/lib/erlang/erts-5.8.2/bin/beam -K true -- -root /usr/lib/erlang -progname'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
431     beam/erl_bestfit_alloc.c: No such file or directory.
        in beam/erl_bestfit_alloc.c
(gdb) where
#0  0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
#1  0x0000000000438bb2 in bf_unlink_free_block (allctr=0x7cbf20, size=, cand_blk=, cand_size=0) at beam/erl_bestfit_alloc.c:791
#2  bf_get_free_block (allctr=0x7cbf20, size=, cand_blk=, cand_size=0) at beam/erl_bestfit_alloc.c:842
#3  0x0000000000433506 in mbc_alloc_block (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:631
#4  mbc_alloc (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:764
#5  0x00000000004b8118 in erts_alloc (c_p=0x7f93a70a90e0, reg=, live=, build_size_term=, extra_words=140272158101112, unit=8) at beam/erl_alloc.h:184
#6  erts_bin_nrml_alloc (c_p=0x7f93a70a90e0, reg=, live=, build_size_term=, extra_words=140272158101112, unit=8) at beam/erl_binary.h:253
#7  erts_bs_append (c_p=0x7f93a70a90e0, reg=, live=, build_size_term=, extra_words=140272158101112, unit=8) at beam/erl_bits.c:1325
#8  0x00000000004e0a02 in process_main () at beam/beam_emu.c:3624
#9  0x000000000043c5eb in erl_start (argc=33, argv=) at beam/erl_init.c:1443
#10 0x0000000000427ac9 in main (argc=8175392, argv=0x7f93a8267460) at sys/unix/erl_main.c:29

--
Igor Goryachev

From elinsn@REDACTED Tue Mar 15 08:20:26 2011
From: elinsn@REDACTED (Sergey Yelin)
Date: Tue, 15 Mar 2011 10:20:26 +0300
Subject: Error in beam compiler (R14B01 on Widows XP)
Message-ID:

Hi list,

I've found a beam compiler error in R14B01. Here is a simple program that
works fine on R14B (erts-5.8.1.1) but fails in R14B01 in the same
environment (Windows XP SP3):

-module(problem7).
-export([find/2]).

find(0, _) -> 0;
find(Nth, Max) ->
    lists:nth(Nth, findp(Max, [], lists:seq(2, Max))).

findp(_, _, []) -> [];
findp(Max, P, [X | T]) when X*X =< Max ->
    P1 = [X] ++ P,
    findp(Max, P1, [N || N <- T, N rem X =/= 0]);
findp(_, P, L) -> P ++ L.

Output for R14B01 (in werl.exe):

Erlang R14B01 (erts-5.8.2) [smp:4:4] [rq:4] [async-threads:0]

Eshell V5.8.2 (abort with ^G)
1> c(problem7).
./problem7.erl:none: internal error in beam_asm; crash reason: {undef, [{beam_asm,module, [{problem7, [{find,2},{module_info,0},{module_info,1}], [], [{function,find,2,2, [{label,1}, {func_info,{atom,problem7},{atom,find},2}, {label,2}, {test,is_eq_exact,{f,3},[{x,0},{integer,0}]}, return, {label,3}, {allocate,2,2}, {move,{x,0},{y,1}}, {move,{integer,2},{x,0}}, {move,{x,1},{y,0}}, {call_ext,2,{extfunc,lists,seq,2}}, {move,nil,{x,1}}, {move,{x,0},{x,2}}, {move,{y,0},{x,0}}, {trim,1,1}, {call,3,{f,5}}, {move,{x,0},{x,1}}, {move,{y,0},{x,0}}, {call_ext_last,2,{extfunc,lists,nth,2},1}]}, {function,findp,3,5, [{label,4}, {func_info,{atom,problem7},{atom,findp},3}, {label,5}, {test,is_nonempty_list,{f,6},[{x,2}]}, {get_list,{x,2},{x,3},{x,4}}, {gc_bif,'*',{f,7},5,[{x,3},{x,3}],{x,5}}, {test,is_ge,{f,7},[{x,0},{x,5}]}, {allocate_heap,2,2,5}, {move,{x,0},{y,1}}, {put_list,{x,3},{x,1},{y,0}}, {move,{x,3},{x,1}}, {move,{x,4},{x,0}}, {call,2,{f,13}}, {move,{y,0},{x,1}}, {move,{x,0},{x,2}}, {move,{y,1},{x,0}}, {call_last,3,{f,5},2}, {label,6}, {test,is_nil,{f,7},[{x,2}]}, {move,nil,{x,0}}, return, {label,7}, {move,{x,1},{x,0}}, {move,{x,2},{x,1}}, {call_ext_only,2,{extfunc,erlang,'++',2}}]}, {function,module_info,0,9, [{label,8}, {func_info, {atom,problem7}, {atom,module_info}, 0}, {label,9}, {move,{atom,problem7},{x,0}}, {call_ext_only,1, {extfunc,erlang,get_module_info,1}}]}, {function,module_info,1,11, [{label,10}, {func_info, {atom,problem7}, {atom,module_info}, 1}, {label,11}, {move,{x,0},{x,1}}, {move,{atom,problem7},{x,0}}, {call_ext_only,2, {extfunc,erlang,get_module_info,2}}]}, {function,'-findp/3-lc$^0/1-0-',2,13, [{label,12}, {func_info, {atom,problem7}, {atom,'-findp/3-lc$^0/1-0-'}, 2}, {label,13}, {test,is_nonempty_list,{f,15},[{x,0}]}, {get_list,{x,0},{x,2},{x,3}}, {gc_bif,'rem',{f,14},4,[{x,2},{x,1}],{x,4}}, {test,is_ne_exact, {f,14}, [{x,4},{integer,0}]}, {allocate,1,4}, {move,{x,3},{x,0}}, {move,{x,2},{y,0}}, {call,2,{f,13}}, {test_heap,2,1}, 
{put_list,{y,0},{x,0},{x,0}}, {deallocate,1}, return, {label,14}, {move,{x,3},{x,0}}, {call_only,2,{f,13}}, {label,15}, {test,is_nil,{f,12},[{x,0}]}, return]}], 16}, [],"z:/Projects/myeuler/problem7.erl",[]]}, {compile,beam_asm,1}, {compile,'-internal_comp/4-anonymous-1-',2}, {compile,fold_comp,3}, {compile,internal_comp,4}, {compile,internal,3}]}
error
2>

From bgustavsson@REDACTED Tue Mar 15 10:08:14 2011
From: bgustavsson@REDACTED (Björn Gustavsson)
Date: Tue, 15 Mar 2011 10:08:14 +0100
Subject: [erlang-bugs] Error in beam compiler (R14B01 on Widows XP)
In-Reply-To:
References:
Message-ID:

On Tue, Mar 15, 2011 at 8:20 AM, Sergey Yelin wrote:
> Hi list,
>
> I've found beam compiler error in R14B01.

I think you have some problems in your installation or
environment.

> Eshell V5.8.2  (abort with ^G)
> 1> c(problem7).
> ./problem7.erl:none: internal error in beam_asm;
> crash reason: {undef,
>                   [{beam_asm,module,

This error message indicates that the beam_asm:module/4
function is undefined, either because the beam_asm module
for some reason is missing or that you have your own
version of the beam_asm module without a module/4 function.

Can you compile any Erlang module?

--
Björn Gustavsson, Erlang/OTP, Ericsson AB

From xramtsov@REDACTED Tue Mar 15 15:22:16 2011
From: xramtsov@REDACTED (Evgeniy Khramtsov)
Date: Tue, 15 Mar 2011 23:22:16 +0900
Subject: send_timeout doesn't work
Message-ID: <4D7F7618.4040305@gmail.com>

It seems like there is a bug in send_timeout option of a TCP socket: the
timeout is completely ignored (at least in active-once mode).

The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl

Just compile it and start lock:listen() in one shell and lock:send() in
another: over a time you will see that the receiving process is locked
in prim_inet:send/3 and doesn't process current message in the mailbox.
You can also play with PORT and SEND_TIMEOUT macros if needed.
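To make the scenario concrete, here is a minimal sketch of a socket opened
with the send_timeout option and written to until the peer stops draining
it. This is not the lock.erl linked above and it is simpler than the
active-once receive loop described in the report; the module name and the
PORT and SEND_TIMEOUT values are arbitrary assumptions.

```erlang
-module(send_timeout_sketch).
-export([listen/0, send/0]).

-define(PORT, 5555).          %% arbitrary port chosen for the sketch
-define(SEND_TIMEOUT, 5000).  %% milliseconds

%% Accept one connection and never read from it, so that the sender's
%% buffers eventually fill up. The socket stays open while this process
%% is blocked in the receive.
listen() ->
    {ok, L} = gen_tcp:listen(?PORT, [binary, {active, false},
                                     {reuseaddr, true}]),
    {ok, _S} = gen_tcp:accept(L),
    receive stop -> ok end.

%% Connect with send_timeout set and keep sending until
%% gen_tcp:send/2 returns an error.
send() ->
    {ok, S} = gen_tcp:connect("localhost", ?PORT,
                              [binary, {active, once},
                               {send_timeout, ?SEND_TIMEOUT}]),
    loop(S, binary:copy(<<0>>, 65536), 0).

loop(S, Chunk, N) ->
    case gen_tcp:send(S, Chunk) of
        ok -> loop(S, Chunk, N + 1);
        {error, Reason} -> {stopped_after, N, Reason}
    end.
```

Run send_timeout_sketch:listen() in one shell and send_timeout_sketch:send()
in another. If the option is honoured, send/0 should eventually return with
an error reason rather than block indefinitely.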
Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).

--
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:xram@REDACTED

From per.melin@REDACTED Tue Mar 15 23:29:51 2011
From: per.melin@REDACTED (Per Melin)
Date: Tue, 15 Mar 2011 23:29:51 +0100
Subject: reltool's app_file option
Message-ID:

The documentation lists 'keep', 'strip' and 'all' as valid values, but
only 'keep' is allowed. The others give you an exit with "Illegal
option: {app_file,all}".

The following line in reltool_server.erl needs Val to be both 'strip'
and 'all' simultaneously:

app_file when Val =:= keep; Val =:= strip, Val =:= all ->

In 0.5.3 (R13B04) and the dev branch.

From hm@REDACTED Wed Mar 16 16:14:13 2011
From: hm@REDACTED (Håkan Mattsson)
Date: Wed, 16 Mar 2011 16:14:13 +0100
Subject: Broken Hipe in R14B02
Message-ID:

I forgot to use --disable-hipe and got the following compilation error.

Time to disable hipe by default?

/Håkan

$ uname -a
Linux tellus 2.6.35-27-generic #48-Ubuntu SMP Tue Feb 22 20:25:46 UTC 2011 x86_64 GNU/Linux
$ ./configure --prefix=/usr/local/pgm/otp_R14B02 --enable-halfword-emulator && make
...
...
...
gcc -g -O3 -I/usr/local/src/otp_src_R14B02/erts/x86_64-unknown-linux-gnu -fno-tree-copyrename -D_GNU_SOURCE -DERTS_SMP -DHAVE_CONFIG_H -Wall -Wstrict-prototypes -Wmissing-prototypes -Wdeclaration-after-statement -DUSE_THREADS -D_THREAD_SAFE -D_REENTRANT -DPOSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS -Ix86_64-unknown-linux-gnu/opt/smp -Ibeam -Isys/unix -Isys/common -Ix86_64-unknown-linux-gnu -Izlib -Ipcre -Ihipe -I../include -I../include/x86_64-unknown-linux-gnu -I../include/internal -I../include/internal/x86_64-unknown-linux-gnu -c hipe/hipe_amd64.c -o obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o
hipe/hipe_amd64.c:40: warning: large integer implicitly truncated to unsigned type
hipe/hipe_amd64.c:42: error: conflicting types for ‘hipe_patch_load_fe’
hipe/hipe_arch.h:28: note: previous declaration of ‘hipe_patch_load_fe’ was here
hipe/hipe_amd64.c:49: error: conflicting types for ‘hipe_patch_insn’
hipe/hipe_arch.h:29: note: previous declaration of ‘hipe_patch_insn’ was here
hipe/hipe_amd64.c: In function ‘hipe_patch_call’:
hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
hipe/hipe_amd64.c: In function ‘hipe_bifs_write_u64_2’:
hipe/hipe_amd64.c:371: warning: passing argument 2 of ‘term_to_Uint’ from incompatible pointer type
beam/big.h:152: note: expected ‘Uint *’ but argument is of type ‘Uint64 *’
make[3]: *** [obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o] Error 1
make[3]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
make[2]: *** [opt] Error 2
make[2]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
:

From sverker@REDACTED Wed Mar 16 16:31:06 2011
From: sverker@REDACTED (Sverker Eriksson)
Date: Wed, 16 Mar 2011 16:31:06 +0100
Subject: [erlang-bugs] Broken Hipe in R14B02
In-Reply-To:
References:
Message-ID: <4D80D7BA.4080502@erix.ericsson.se>

Hipe and halfword emulator do not play nice together yet.

Any problems without --enable-halfword-emulator?

/Sverker, Erlang/OTP

Håkan Mattsson wrote:
> I forgot to use --disable-hipe and got the following compilation error.
>
> Time to disable hipe by default?
>
> /Håkan
>
> $ uname -a
> Linux tellus 2.6.35-27-generic #48-Ubuntu SMP Tue Feb 22 20:25:46 UTC
> 2011 x86_64 GNU/Linux
> $ ./configure --prefix=/usr/local/pgm/otp_R14B02
> --enable-halfword-emulator && make
> ...
> ...
> ...
> gcc -g -O3 -I/usr/local/src/otp_src_R14B02/erts/x86_64-unknown-linux-gnu
> -fno-tree-copyrename -D_GNU_SOURCE -DERTS_SMP -DHAVE_CONFIG_H -Wall
> -Wstrict-prototypes -Wmissing-prototypes -Wdeclaration-after-statement
> -DUSE_THREADS -D_THREAD_SAFE -D_REENTRANT
> -DPOSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS
> -Ix86_64-unknown-linux-gnu/opt/smp -Ibeam -Isys/unix -Isys/common
> -Ix86_64-unknown-linux-gnu -Izlib -Ipcre -Ihipe -I../include
> -I../include/x86_64-unknown-linux-gnu -I../include/internal
> -I../include/internal/x86_64-unknown-linux-gnu -c hipe/hipe_amd64.c -o
> obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o
> hipe/hipe_amd64.c:40: warning: large integer implicitly truncated to
> unsigned type
> hipe/hipe_amd64.c:42: error: conflicting types for ‘hipe_patch_load_fe’
> hipe/hipe_arch.h:28: note: previous declaration of ‘hipe_patch_load_fe’ was here
> hipe/hipe_amd64.c:49: error: conflicting types for ‘hipe_patch_insn’
> hipe/hipe_arch.h:29: note: previous declaration of ‘hipe_patch_insn’ was here
> hipe/hipe_amd64.c: In function ‘hipe_patch_call’:
> hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
> hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
> hipe/hipe_amd64.c: In function ‘hipe_bifs_write_u64_2’:
> hipe/hipe_amd64.c:371: warning: passing argument 2 of ‘term_to_Uint’
> from incompatible pointer type
> beam/big.h:152: note: expected ‘Uint *’ but argument is of type ‘Uint64 *’
> make[3]: *** [obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o] Error 1
> make[3]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
> make[2]: *** [opt] Error 2
> make[2]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
> :
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
>
>

From boris.muehmer@REDACTED Wed Mar 16 16:51:00 2011
From: boris.muehmer@REDACTED (Boris Mühmer)
Date: Wed, 16 Mar 2011 16:51:00 +0100
Subject: R14B02: "make install-docs" fails on Ubuntu 10.04 / 10.10 (using the source tar-ball)
Message-ID:

"make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
R14B02 source tar-ball (like in "R14B01").

The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
would be to change line 412 from

        $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml

to

        $(ERL_TOP)/bin/escript $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml

 - boris

From andrew@REDACTED Wed Mar 16 19:15:56 2011
From: andrew@REDACTED (Andrew Thompson)
Date: Wed, 16 Mar 2011 14:15:56 -0400
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu 10.04 / 10.10 (using the source tar-ball)
In-Reply-To:
References:
Message-ID: <20110316181555.GL6177@hijacked.us>

On Wed, Mar 16, 2011 at 04:51:00PM +0100, Boris Mühmer wrote:
> "make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
> R14B02 source tar-ball (like in "R14B01").
>
> The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
> would be to change line 412 from
>         $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript
> -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
> to
>         $(ERL_TOP)/bin/escript
> $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
> $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
>

Came here to report this.
Boris' fix seems to solve the issue but then I get this error:

=== Entering application common_test
make  RELEASE_PATH=/usr/local/lib/erlang   release_docs_spec
escript /Users/andrew/otp_src_R14B02/lib/erl_docgen/priv/bin/xml_from_edoc.escript -preprocess true -i /include \
        -i ../../../test_server/include -i ../../include \
        -i ../../../../erts/lib/kernel/include -i ../../../../lib/kernel/include \
        -i ../../../../erts/lib/snmp/include -i ../../../../lib/snmp/include ../../src/ct.erl
escript: exception error: undefined function edoc:file/2
  in function  erl_eval:local_func/5
  in call from escript:interpret/4
  in call from escript:start/1
  in call from init:start_it/1
  in call from init:start_em/1
make[5]: *** [ct.xml] Error 127
make[4]: *** [release_docs] Error 2
make[3]: *** [release_docs] Error 2
make[2]: *** [release_docs] Error 2
make[1]: *** [release_docs] Error 2
make: *** [install-docs] Error 2

Looks like a similar problem, except that this escript is being invoked
via 'escript'. Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
seems to fix it, but it's wrong in a bunch of the doc Makefiles.

Andrew

From andrew@REDACTED Wed Mar 16 19:21:48 2011
From: andrew@REDACTED (Andrew Thompson)
Date: Wed, 16 Mar 2011 14:21:48 -0400
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu 10.04 / 10.10 (using the source tar-ball)
In-Reply-To: <20110316181555.GL6177@hijacked.us>
References: <20110316181555.GL6177@hijacked.us>
Message-ID: <20110316182147.GM6177@hijacked.us>

On Wed, Mar 16, 2011 at 02:15:56PM -0400, Andrew Thompson wrote:
> Looks like a similar problem, except that this escript is being invoked
> via 'escript' Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
> seems to fix it, but its wrong in a bunch of the doc Makefiles.
>

That should be $(ERL_TOP)/bin/escript, obviously.

Andrew

From lukas@REDACTED Thu Mar 17 10:38:17 2011
From: lukas@REDACTED (Lukas Larsson)
Date: Thu, 17 Mar 2011 10:38:17 +0100
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu 10.04 / 10.10 (using the source tar-ball)
In-Reply-To: <20110316181555.GL6177@hijacked.us>
References: <20110316181555.GL6177@hijacked.us>
Message-ID: <1300354697.2336.37.camel@bilbo>

Hi Boris and Andrew!

Thanks for pointing this out. It has to do with the fact that an old
version of escript (and by extension the erlang VM) is used to build the
docs. A workaround for now is to add /your/r14b02/path/bin/ into your
PATH and then build the docs.

I'm however unsure if the patch you provided will work for us as it
might not always be true that one would want to use the
$ERL_TOP/bin/escript emulator to build the docs. I'll try to come up
with a solution which works for both scenarios.

Lukas

On Wed, 2011-03-16 at 14:15 -0400, Andrew Thompson wrote:
> On Wed, Mar 16, 2011 at 04:51:00PM +0100, Boris Mühmer wrote:
> > "make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
> > R14B02 source tar-ball (like in "R14B01").
> >
> > The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
> > would be to change line 412 from
> >         $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript
> > -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
> > to
> >         $(ERL_TOP)/bin/escript
> > $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
> > $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
> >
>
> Came here to report this. Boris' fix seems to solve the issue but then I
> get this error:
>
> === Entering application common_test
> make RELEASE_PATH=/usr/local/lib/erlang release_docs_spec
> escript
> /Users/andrew/otp_src_R14B02/lib/erl_docgen/priv/bin/xml_from_edoc.escript
> -preprocess true -i /include \
>         -i ../../../test_server/include -i ../../include \
>         -i ../../../../erts/lib/kernel/include -i
> ../../../../lib/kernel/include \
>         -i ../../../../erts/lib/snmp/include -i ../../../../lib/snmp/include
> ../../src/ct.erl
> escript: exception error: undefined function edoc:file/2
>   in function erl_eval:local_func/5
>   in call from escript:interpret/4
>   in call from escript:start/1
>   in call from init:start_it/1
>   in call from init:start_em/1
> make[5]: *** [ct.xml] Error 127
> make[4]: *** [release_docs] Error 2
> make[3]: *** [release_docs] Error 2
> make[2]: *** [release_docs] Error 2
> make[1]: *** [release_docs] Error 2
> make: *** [install-docs] Error 2
>
> Looks like a similar problem, except that this escript is being invoked
> via 'escript' Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
> seems to fix it, but its wrong in a bunch of the doc Makefiles.
>
> Andrew
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED

From boris.muehmer@REDACTED Thu Mar 17 11:04:08 2011
From: boris.muehmer@REDACTED (Boris Mühmer)
Date: Thu, 17 Mar 2011 11:04:08 +0100
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu 10.04 / 10.10 (using the source tar-ball)
In-Reply-To: <1300354697.2336.37.camel@bilbo>
References: <20110316181555.GL6177@hijacked.us> <1300354697.2336.37.camel@bilbo>
Message-ID:

The "funny" thing was (this was also true when I wrote about it concerning
the R14B01 release), that adding the path wasn't enough.

My normal procedure for installing from source is:

 1> tar xvf
 2> cd
 3> export LANG=C
 4> export ERL_TOP="`pwd`"
 5> export PATH="$ERL_TOP/bin:$PATH"
 6> ./configure --prefix=
 7> ( make all && make install && make docs && make install-docs ) 2>&1 | tee log-build.txt

With step 5 the "right" escript should be in the path, but without patching the
Makefile/Makefile.in "make install-docs" does fail on "my" systems.

Currently I don't understand why "env" fails to locate the right "escript"
from the PATH.

Besides: there is neither an Erlang installation from the Ubuntu repositories
on my systems, nor is another Erlang installation bin-directory in my PATH.

 - boris

2011/3/17 Lukas Larsson :
> Hi Boris and Andrew!
>
> Thanks for pointing this out. It has to do with the fact that an old
> version of escript (and in extension the erlang VM) is used to build the
> docs. A workaround for now is to add /your/r14b02/path/bin/ into your
> PATH and then build the docs.
>
> I'm however unsure if the patch you provided will work for us as it
> might not always be true that one would want to use the
> $ERL_TOP/bin/escript emulator to build the docs. I'll try to come up
> with a solutions which works for both scenarios.
>
> Lukas
>
> On Wed, 2011-03-16 at 14:15 -0400, Andrew Thompson wrote:
>> On Wed, Mar 16, 2011 at 04:51:00PM +0100, Boris Mühmer wrote:
>> > "make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
>> > R14B02 source tar-ball (like in "R14B01").
>> >
>> > The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
>> > would be to change line 412 from
>> >         $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript
>> > -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
>> > to
>> >         $(ERL_TOP)/bin/escript
>> > $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
>> > $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
>> >
>>
>> Came here to report this. Boris' fix seems to solve the issue but then I
>> get this error:
>>
>> === Entering application common_test
>> make  RELEASE_PATH=/usr/local/lib/erlang   release_docs_spec
>> escript
>> /Users/andrew/otp_src_R14B02/lib/erl_docgen/priv/bin/xml_from_edoc.escript
>> -preprocess true -i /include \
>>               -i ../../../test_server/include -i  ../../include \
>>               -i ../../../../erts/lib/kernel/include -i
>> ../../../../lib/kernel/include \
>>               -i ../../../../erts/lib/snmp/include -i ../../../../lib/snmp/include
>> ../../src/ct.erl
>> escript: exception error: undefined function edoc:file/2
>>   in function  erl_eval:local_func/5
>>   in call from escript:interpret/4
>>   in call from escript:start/1
>>   in call from init:start_it/1
>>   in call from init:start_em/1
>> make[5]: *** [ct.xml] Error 127
>> make[4]: *** [release_docs] Error 2
>> make[3]: *** [release_docs] Error 2
>> make[2]: *** [release_docs] Error 2
>> make[1]: *** [release_docs] Error 2
>> make: *** [install-docs] Error 2
>>
>> Looks like a similar problem, except that this escript is being invoked
>> via 'escript' Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
>> seems to fix it, but its wrong in a bunch of the doc Makefiles.
>>
>> Andrew
>>
>> ________________________________________________________________
>> erlang-bugs (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
>
>

From lukas@REDACTED Thu Mar 17 11:15:49 2011
From: lukas@REDACTED (Lukas Larsson)
Date: Thu, 17 Mar 2011 11:15:49 +0100
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu 10.04 / 10.10 (using the source tar-ball)
In-Reply-To:
References: <20110316181555.GL6177@hijacked.us> <1300354697.2336.37.camel@bilbo>
Message-ID: <1300356950.2336.57.camel@bilbo>

Ah, ok. Then the workaround only works for Andrew's problem, but not for
yours.
Adding escript (without the $(ERL_TOP)/bin) before
$(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
$(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml should do the
trick as long as you have the latest version of erlang in your path
while building.

That will probably have to be good enough for now, unless someone comes
up with a solution which addresses the problem with configuring which vm
to use.

Lukas

On Thu, 2011-03-17 at 11:04 +0100, Boris Mühmer wrote:
> The "funny" thing was (this was also true when I wrote about it concerning
> the R14B01 release), that adding the path wasn't enough.
>
> My normal procedure for installing from source is:
>
>  1> tar xvf
>  2> cd
>  3> export LANG=C
>  4> export ERL_TOP="`pwd`"
>  5> export PATH="$ERL_TOP/bin:$PATH"
>  6> ./configure --prefix=
>  7> ( make all && make install && make docs && make install-docs )
> 2>&1 | tee log-build.txt
>
> With step 5 the "right" escript should be in the path, but without patching the
> makefile/makefie.in "make install-docs" does fail on "my" systems.
>
> Currently I don't understand why "env" fails to locate the right "escript"
> from the PATH.
>
> Besides: there is neither an Erlang installation from the Ubuntu repositories
> on my systems, nor is another Erlang installation bin-directory in my PATH.
>
>
>  - boris
>
>
> 2011/3/17 Lukas Larsson :
> > Hi Boris and Andrew!
> >
> > Thanks for pointing this out. It has to do with the fact that an old
> > version of escript (and in extension the erlang VM) is used to build the
> > docs. A workaround for now is to add /your/r14b02/path/bin/ into your
> > PATH and then build the docs.
> >
> > I'm however unsure if the patch you provided will work for us as it
> > might not always be true that one would want to use the
> > $ERL_TOP/bin/escript emulator to build the docs. I'll try to come up
> > with a solutions which works for both scenarios.
> >
> > Lukas
> >
> > On Wed, 2011-03-16 at 14:15 -0400, Andrew Thompson wrote:
> >> On Wed, Mar 16, 2011 at 04:51:00PM +0100, Boris Mühmer wrote:
> >> > "make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
> >> > R14B02 source tar-ball (like in "R14B01").
> >> >
> >> > The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
> >> > would be to change line 412 from
> >> >         $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript
> >> > -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
> >> > to
> >> >         $(ERL_TOP)/bin/escript
> >> > $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
> >> > $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
> >> >
> >>
> >> Came here to report this. Boris' fix seems to solve the issue but then I
> >> get this error:
> >>
> >> === Entering application common_test
> >> make RELEASE_PATH=/usr/local/lib/erlang release_docs_spec
> >> escript
> >> /Users/andrew/otp_src_R14B02/lib/erl_docgen/priv/bin/xml_from_edoc.escript
> >> -preprocess true -i /include \
> >>         -i ../../../test_server/include -i ../../include \
> >>         -i ../../../../erts/lib/kernel/include -i
> >> ../../../../lib/kernel/include \
> >>         -i ../../../../erts/lib/snmp/include -i ../../../../lib/snmp/include
> >> ../../src/ct.erl
> >> escript: exception error: undefined function edoc:file/2
> >>   in function erl_eval:local_func/5
> >>   in call from escript:interpret/4
> >>   in call from escript:start/1
> >>   in call from init:start_it/1
> >>   in call from init:start_em/1
> >> make[5]: *** [ct.xml] Error 127
> >> make[4]: *** [release_docs] Error 2
> >> make[3]: *** [release_docs] Error 2
> >> make[2]: *** [release_docs] Error 2
> >> make[1]: *** [release_docs] Error 2
> >> make: *** [install-docs] Error 2
> >>
> >> Looks like a similar problem, except that this escript is being invoked
> >> via 'escript' Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
> >> seems to fix it, but its wrong in a bunch of the doc Makefiles.
> >>
> >> Andrew
> >>
> >> ________________________________________________________________
> >> erlang-bugs (at) erlang.org mailing list.
> >> See http://www.erlang.org/faq.html
> >> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
> >
> >
> >
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>

From mikpe@REDACTED Thu Mar 17 11:43:36 2011
From: mikpe@REDACTED (Mikael Pettersson)
Date: Thu, 17 Mar 2011 11:43:36 +0100
Subject: [erlang-bugs] Broken Hipe in R14B02
In-Reply-To: <4D80D7BA.4080502@erix.ericsson.se>
References: <4D80D7BA.4080502@erix.ericsson.se>
Message-ID: <19841.58840.701872.705972@pilspetsen.it.uu.se>

On Wed, 16 Mar 2011 16:31:06 +0100, Sverker Eriksson wrote:
> Hipe and halfword emulator do not play nice together yet.
>
> Any problems without --enable-halfword-emulator?
>
> /Sverker, Erlang/OTP
>
>
> Håkan Mattsson wrote:
> > I forgot to use --disable-hipe and got the following compilation error.
> >
> > Time to disable hipe by default?
> >
> > /Håkan
> >
> > $ uname -a
> > Linux tellus 2.6.35-27-generic #48-Ubuntu SMP Tue Feb 22 20:25:46 UTC
> > 2011 x86_64 GNU/Linux
> > $ ./configure --prefix=/usr/local/pgm/otp_R14B02
> > --enable-halfword-emulator && make
> > ...
> > ...
> > ...
> > gcc -g -O3 -I/usr/local/src/otp_src_R14B02/erts/x86_64-unknown-linux-gnu
> > -fno-tree-copyrename -D_GNU_SOURCE -DERTS_SMP -DHAVE_CONFIG_H -Wall
> > -Wstrict-prototypes -Wmissing-prototypes -Wdeclaration-after-statement
> > -DUSE_THREADS -D_THREAD_SAFE -D_REENTRANT
> > -DPOSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS
> > -Ix86_64-unknown-linux-gnu/opt/smp -Ibeam -Isys/unix -Isys/common
> > -Ix86_64-unknown-linux-gnu -Izlib -Ipcre -Ihipe -I../include
> > -I../include/x86_64-unknown-linux-gnu -I../include/internal
> > -I../include/internal/x86_64-unknown-linux-gnu -c hipe/hipe_amd64.c -o
> > obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o
> > hipe/hipe_amd64.c:40: warning: large integer implicitly truncated to
> > unsigned type
> > hipe/hipe_amd64.c:42: error: conflicting types for ‘hipe_patch_load_fe’
> > hipe/hipe_arch.h:28: note: previous declaration of ‘hipe_patch_load_fe’ was here
> > hipe/hipe_amd64.c:49: error: conflicting types for ‘hipe_patch_insn’
> > hipe/hipe_arch.h:29: note: previous declaration of ‘hipe_patch_insn’ was here
> > hipe/hipe_amd64.c: In function ‘hipe_patch_call’:
> > hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
> > hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
> > hipe/hipe_amd64.c: In function ‘hipe_bifs_write_u64_2’:
> > hipe/hipe_amd64.c:371: warning: passing argument 2 of ‘term_to_Uint’
> > from incompatible pointer type
> > beam/big.h:152: note: expected ‘Uint *’ but argument is of type ‘Uint64 *’
> > make[3]: *** [obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o] Error 1
> > make[3]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
> > make[2]: *** [opt] Error 2
> > make[2]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
> > :

The halfword emulator is a new beast with an execution mode that
differs significantly from existing 32- and 64-bit modes.
The build error in the runtime code you got is just the tip of the
iceberg; non-trivial changes to the compiler would be required to
support the halfword emulator.

For now the best solution is to auto-disable HiPE if halfword emulator
is enabled, and error out if both are explicitly enabled.

(If someone wants to fund the development of HiPE support for halfword
emulator on Linux/AMD64, contact me offline.)

/Mikael

From sverker@REDACTED Thu Mar 17 12:13:29 2011
From: sverker@REDACTED (Sverker Eriksson)
Date: Thu, 17 Mar 2011 12:13:29 +0100
Subject: [erlang-bugs] Broken Hipe in R14B02
In-Reply-To: <19841.58840.701872.705972@pilspetsen.it.uu.se>
References: <4D80D7BA.4080502@erix.ericsson.se> <19841.58840.701872.705972@pilspetsen.it.uu.se>
Message-ID: <4D81ECD9.6040103@erix.ericsson.se>

Mikael Pettersson wrote:
> For now the best solution is to auto-disable HiPE if halfword emulator
> is enabled, and error out if both are explicitly enabled.
>
Agreed. It will turn up in the dev branch for the next release.

/Sverker, Erlang/OTP

From eric.pailleau@REDACTED Thu Mar 17 22:25:07 2011
From: eric.pailleau@REDACTED (PAILLEAU Eric)
Date: Thu, 17 Mar 2011 22:25:07 +0100
Subject: [erlang-bugs] wx undefined symbol
In-Reply-To: <4D6D5807.4020408@pailleau.org>
References: <4D5AF2D8.4070105@wanadoo.fr> <4D5AF612.3010701@wanadoo.fr> <4D6D5807.4020408@pailleau.org>
Message-ID: <4D827C33.8060404@wanadoo.fr>

Hi,
I tried with the latest R14B02, and WX is working without changing
anything else (?!).
I do not see in the readme file what could have solved my problem, but
anyway, it works now.

I just get some annoying output in the erl shell:

(Erlang:12555): Gtk-WARNING **: gtk_widget_size_allocate(): attempt to
allocate widget with width -5 and height 17

I got this by playing with wx:demo().

Thanks to Dan Gudmundsson for his help, even if I did not have the time
to try his tips.

regards.

From pan@REDACTED Fri Mar 18 10:49:16 2011
From: pan@REDACTED (pan@REDACTED)
Date: Fri, 18 Mar 2011 10:49:16 +0100
Subject: [erlang-bugs] segmentation fault in tree_delete at beam/erl_bestfit_alloc.c:431
In-Reply-To: <871v29a0ze.fsf@goryachev.org>
References: <871v29a0ze.fsf@goryachev.org>
Message-ID:

Hi Igor!

Sadly enough, this is the worst kind of core you could ever have :(

The core is generated in the allocators, but that's most probably not
the allocators' fault. Something has written outside of an allocated
area earlier and now the error shows up in some (possibly/probably)
unrelated place.

First of all, I have to ask if you have some non-OTP drivers or NIFs
loaded in the VM? Have you loaded some native code not supplied in the
Erlang distribution? In that case, try to rule out errors in that code
and in libraries loaded by that code by e.g. disabling it in some way
(write slower erlang-replacements etc).

Next question is if you use some drivers or NIFs provided by us that
pull third party libraries, like Wx or Crypto (by using SSL etc). If
we could isolate the problem to a driver (ours or yours) the
search space would be greatly reduced.

Also, looking at the core locally would possibly help me to identify
the type of data that has been written into the block, which possibly
could narrow it down, so if you could tar your compiled build tree and
the core and put it on something where I can fetch it (mail me
personally with the details, if you can do that), that would be
helpful.

If the workload is low, running the VM under Valgrind would probably
be feasible. There is a special valgrind target when doing make in the
$ERL_TOP/erts/emulator directory, you can do 'make FLAVOR=smp valgrind'
if you have valgrind 3.4 or higher installed on the system.
Running cerl -valgrind (from the $ERL_TOP/bin directory) would then
start erlang in the valgrind virtual environment, which should point
out any illegal memory accesses (note that some warnings are expected,
namely a lot of PossiblyLost, which is due to us keeping pointers
*into* structures instead of to the beginning of the structures).

Another possibility is to compile all C code with -D_FORTIFY_SOURCE,
which may find faulty memory accesses too.

You say this is frequent. Is it in any way manually reproducible? Have
you got any idea of which erlang-code is run when this happens
(i.e. during some special kind of workload)? One possibility is that
this is a compiler error (in our compiler that is), so a module
triggering the problem would also be interesting.

Please make sure to run R14B02 and recompile all erlang code with the
latest Erlang version to rule out any bug that's already corrected :)

Sorry for the big fluffy list of options, but as I said, this is a
kind of error that is really hard to track down...

Cheers,
/Patrik

On Mon, 14 Mar 2011, Igor Goryachev wrote:

> Hello.
>
> We are suffering from quite frequent segmentation faults on our erlangish
> environment. We run r14b01 node with a very small load on linux 2.6.32
> (Debian GNU/Linux Squeeze 6.0), which is virtual machine hosted under
> OpenVZ hypervisor (16 cores, Xeon 2.40GHz).
>
> I've tried to rebuild erlang with and without smp and threads, but in any
> case I'm getting the same behaviour.
>
> What additional helpful information should I provide?
>
>
> Core was generated by `/usr/lib/erlang/erts-5.8.2/bin/beam -K true -- -root /usr/lib/erlang -progname'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
> 431 beam/erl_bestfit_alloc.c: No such file or directory.
> in beam/erl_bestfit_alloc.c
> (gdb) where
> #0 0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
> #1 0x0000000000438bb2 in bf_unlink_free_block (allctr=0x7cbf20, size=, cand_blk=,
>    cand_size=0) at beam/erl_bestfit_alloc.c:791
> #2 bf_get_free_block (allctr=0x7cbf20, size=, cand_blk=, cand_size=0)
>    at beam/erl_bestfit_alloc.c:842
> #3 0x0000000000433506 in mbc_alloc_block (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:631
> #4 mbc_alloc (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:764
> #5 0x00000000004b8118 in erts_alloc (c_p=0x7f93a70a90e0, reg=, live=,
>    build_size_term=, extra_words=140272158101112, unit=8) at beam/erl_alloc.h:184
> #6 erts_bin_nrml_alloc (c_p=0x7f93a70a90e0, reg=, live=,
>    build_size_term=, extra_words=140272158101112, unit=8) at beam/erl_binary.h:253
> #7 erts_bs_append (c_p=0x7f93a70a90e0, reg=, live=, build_size_term=,
>    extra_words=140272158101112, unit=8) at beam/erl_bits.c:1325
> #8 0x00000000004e0a02 in process_main () at beam/beam_emu.c:3624
> #9 0x000000000043c5eb in erl_start (argc=33, argv=) at beam/erl_init.c:1443
> #10 0x0000000000427ac9 in main (argc=8175392, argv=0x7f93a8267460) at sys/unix/erl_main.c:29
>
>
> --
> Igor Goryachev
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>

From kwidoyo@REDACTED Fri Mar 18 11:03:43 2011
From: kwidoyo@REDACTED (Kustarto Widoyo)
Date: Fri, 18 Mar 2011 19:03:43 +0900
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To:
References: <4D4BA2F3.2080405@geminimobile.com> <4D75E41C.1080502@geminimobile.com>
Message-ID: <4D832DFF.8000008@geminimobile.com>

Patrik did an investigation of this issue and, FYI, the following was his comment.

> The problem encountered is that the internal printf gets a really deep
> structure to format for the call erlang:system_info(procs), called from a
> fun declared in gmt_cinfo_basic:erlang_system_info. The call formats a
> binary with debug information for every process in this huge system, of
> which one has this reeeeally deep list structure.
>
> I shall limit the depth of the output in the erts_printf call, but you
> should really not call erlang:system_info(procs) in such a huge system.
> Using that kind of debug functionality in the system will cost *a lot*
> in terms of memory and CPU. So a workaround for this would be to disable
> this dumping of process debug information in your system.

Thank you very much Patrik.

Widoyo

From pan@REDACTED Fri Mar 18 11:24:33 2011
From: pan@REDACTED (pan@REDACTED)
Date: Fri, 18 Mar 2011 11:24:33 +0100
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To: <4D832DFF.8000008@geminimobile.com>
References: <4D4BA2F3.2080405@geminimobile.com> <4D75E41C.1080502@geminimobile.com> <4D832DFF.8000008@geminimobile.com>
Message-ID:

Hi!

A correction to erts_printf that makes it no longer recurse on the C
stack is on its way. Expect to see it in the GitHub dev branch in a
few days.

On Fri, 18 Mar 2011, Kustarto Widoyo wrote:

> Patrik did an investigation of this issue and, FYI, the following was his
> comment.
>
>> The problem encountered is that the internal printf gets a really deep
>> structure to format for the call erlang:system_info(procs), called from a
>> fun declared in gmt_cinfo_basic:erlang_system_info. The call formats a
>> binary with debug information for every process in this huge system, of
>> which one has this reeeeally deep list structure.
>>
>> I shall limit the depth of the output in the erts_printf call, but you
>> should really not call erlang:system_info(procs) in such a huge system.
>> Using that kind of debug functionality in the system will cost *a lot*
>> in terms of memory and CPU. So a workaround for this would be to disable
>> this dumping of process debug information in your system.
>
> Thank you very much Patrik.

Thank you for the help in tracking this down!

>
> Widoyo

Cheers,
/Patrik

>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>

From xramtsov@REDACTED Mon Mar 21 07:34:39 2011
From: xramtsov@REDACTED (Evgeniy Khramtsov)
Date: Mon, 21 Mar 2011 15:34:39 +0900
Subject: send_timeout doesn't work
In-Reply-To: <4D7F7618.4040305@gmail.com>
References: <4D7F7618.4040305@gmail.com>
Message-ID: <4D86F17F.6060702@gmail.com>

15.03.2011 23:22, Evgeniy Khramtsov wrote:
> It seems like there is a bug in send_timeout option of a TCP socket:
> the timeout is completely ignored (at least in active-once mode).
> The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl
> Just compile it and start lock:listen() in one shell and lock:send()
> in another: over time you will see that the receiving process is
> locked in prim_inet:send/3 and doesn't process current message in the
> mailbox. You can also play with PORT and SEND_TIMEOUT macros if needed.
>
> Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).
>

Any response on this? Has anyone been able to reproduce the problem?

--
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:xram@REDACTED

From spawn.think@REDACTED Mon Mar 21 10:59:41 2011
From: spawn.think@REDACTED (Ahmed Omar)
Date: Mon, 21 Mar 2011 10:59:41 +0100
Subject: [erlang-bugs] Re: send_timeout doesn't work
In-Reply-To: <4D86F17F.6060702@gmail.com>
References: <4D7F7618.4040305@gmail.com> <4D86F17F.6060702@gmail.com>
Message-ID:

http://erlang.2086793.n4.nabble.com/tcp-connection-with-timeout-td2090360.html

On Mon, Mar 21, 2011 at 7:34 AM, Evgeniy Khramtsov wrote:

> 15.03.2011 23:22, Evgeniy Khramtsov wrote:
>
>> It seems like there is a bug in send_timeout option of a TCP socket: the
>> timeout is completely ignored (at least in active-once mode).
>> The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl
>> Just compile it and start lock:listen() in one shell and lock:send() in
>> another: over time you will see that the receiving process is locked in
>> prim_inet:send/3 and doesn't process current message in the mailbox. You can
>> also play with PORT and SEND_TIMEOUT macros if needed.
>>
>> Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).
>>
>>
> Any response on this? Has anyone been able to reproduce the problem?
>
>
> --
> Regards,
> Evgeniy Khramtsov, ProcessOne.
> xmpp:xram@REDACTED
>
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
>

--
Best Regards,
- Ahmed Omar
http://nl.linkedin.com/in/adiaa
Follow me on twitter
@spawn_think

From xramtsov@REDACTED Mon Mar 21 11:18:49 2011
From: xramtsov@REDACTED (Evgeniy Khramtsov)
Date: Mon, 21 Mar 2011 19:18:49 +0900
Subject: [erlang-bugs] Re: send_timeout doesn't work
In-Reply-To:
References: <4D7F7618.4040305@gmail.com> <4D86F17F.6060702@gmail.com>
Message-ID: <4D872609.6060006@gmail.com>

21.03.2011 18:59, Ahmed Omar wrote:
> http://erlang.2086793.n4.nabble.com/tcp-connection-with-timeout-td2090360.html
>

So what?
How does that relate to the fact that send_timeout never works?

--
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:xram@REDACTED

From pan@REDACTED Mon Mar 21 15:07:30 2011
From: pan@REDACTED (pan@REDACTED)
Date: Mon, 21 Mar 2011 15:07:30 +0100
Subject: [erlang-bugs] send_timeout doesn't work
In-Reply-To: <4D7F7618.4040305@gmail.com>
References: <4D7F7618.4040305@gmail.com>
Message-ID:

Hi!

Very good test program. There's obviously something wrong here, and
that's the handling of timeouts when we are in active mode. It's
broken.

Please try the attached (very simple) patch, it should fix the
problem. It's still not tested in our daily builds, but it will soon
be. Any feedback is welcome!

Cheers,
/Patrik

On Tue, 15 Mar 2011, Evgeniy Khramtsov wrote:

> It seems like there is a bug in send_timeout option of a TCP socket: the
> timeout is completely ignored (at least in active-once mode).
> The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl
> Just compile it and start lock:listen() in one shell and lock:send() in
> another: over time you will see that the receiving process is locked in
> prim_inet:send/3 and doesn't process current message in the mailbox. You can
> also play with PORT and SEND_TIMEOUT macros if needed.
>
> Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).
>
> --
> Regards,
> Evgeniy Khramtsov, ProcessOne.
> xmpp:xram@REDACTED
>
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tcp_send_timeout.diff
Type: text/x-patch
Size: 587 bytes
Desc: fix for gen_tcp send timeouts
URL:

From igor@REDACTED Mon Mar 21 19:46:40 2011
From: igor@REDACTED (Igor Goryachev)
Date: Mon, 21 Mar 2011 21:46:40 +0300
Subject: [erlang-bugs] segmentation fault in tree_delete at beam/erl_bestfit_alloc.c:431
In-Reply-To: (pan@erlang.org's message of "Fri, 18 Mar 2011 10:49:16 +0100")
References: <871v29a0ze.fsf@goryachev.org>
Message-ID: <87hbawwbu7.fsf@goryachev.org>

Hi, Patrik.

On Fri, Mar 18, 2011 at 12:49, somebody wrote:

> First of all, I have to ask if you have some non-OTP drivers or NIFs
> loaded in the VM? Have you loaded some native code not supplied in the
> Erlang distribution? In that case, try to rule out errors in that code
> and in libraries loaded by that code by e.g. disabling it in some way
> (write slower erlang-replacements etc).

We have a pair of nodes per machine which are sort of frontend/backend
and are speaking with each other using standard erlangish rpc. The
frontend node (the one which segfaults) uses the exmpp library by
ProcessOne. Other third-party libraries (and my own code) do not
contain non-OTP linked-in drivers or NIFs.

> Next question is if you use some drivers or NIFs provided by us that
> pull third party libraries, like Wx or Crypto (by using SSL etc). If
> we could isolate the problem to a driver (ours or yours) the
> search space would be greatly reduced.

We have no encryption here; the crypto application is loaded only for
sha/1 usage. No wx, etc...

> Also, looking at the core locally would possibly help me to identify
> the type of data that has been written into the block, which possibly
> could narrow it down, so if you could tar your compiled build tree and
> the core and put it on something where I can fetch it (mail me
> personally with the details, if you can do that), that would be
> helpful.

Ok, I will prepare a tarball as soon as possible and send you a link.

> You say this is frequent.
> Is it in any way manually reproducible? Have
> you got any idea of which erlang-code is run when this happens
> (i.e. during some special kind of workload)? One possibility is that
> this is a compiler error (in our compiler that is), so a module
> triggering the problem would also be interesting.

It occurs two or three times a day. For now I have no idea how it
could be reproduced manually.

> Please make sure to run R14B02 and recompile all erlang code with the
> latest Erlang version to rule out any bug that's already corrected :)

Yes, I have already installed R14B02, but the behaviour is the same.

> Sorry for the big fluffy list of options, but as I said, this is a
> kind of error that is really hard to track down...

Thank you very much for your answer. I hope we resolve this issue. :-)

--
Igor Goryachev

From raimo+erlang-bugs@REDACTED Tue Mar 22 14:04:02 2011
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Tue, 22 Mar 2011 14:04:02 +0100
Subject: Mailing list software change
Message-ID: <20110322130402.GC12691@erix.ericsson.se>

Hi all.

We will change servers and mailing list software from the old-fashioned
ezmlm to GNU Mailman on Thu Mar 24 afternoon (CET).

All subscriptions will be transferred into the corresponding Mailman
settings. The Mailman web interface will probably not be up and running
at first; we'll see about that later.

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From den@REDACTED Thu Mar 24 14:11:47 2011
From: den@REDACTED (Denis Afonin)
Date: Thu, 24 Mar 2011 16:11:47 +0300
Subject: Orber application don`t depend to mnesia
Message-ID: <20110324161147.1726b6d8@shimbo>

Hi,

In embedded mode the orber application attempts to start before mnesia,
so it crashes.

Erlang version: debian, 1:14.a-dfsg-3.

Regards,
Denis.

PS Here is the patch:

diff -Naur erlang-14.a-dfsg/lib/orber/src/orber.app.src erlang-14.a-dfsg.1/lib/orber/src/orber.app.src
--- erlang-14.a-dfsg/lib/orber/src/orber.app.src 2011-03-24 13:07:00.221000018 +0000
+++ erlang-14.a-dfsg.1/lib/orber/src/orber.app.src 2011-03-24 13:06:14.045000018 +0000
@@ -101,7 +101,7 @@
                orber_iiop_insup, orber_init, orber_reqno,
                orber_objkeyserver, orber_iiop_socketsup,
                orber_iiop_pm, orber_env]},
-  {applications, [stdlib, kernel]},
+  {applications, [stdlib, kernel, mnesia]},
   {env, []},
   {mod, {orber, []}}
  ]}.

From mcbain@REDACTED Thu Mar 24 14:25:33 2011
From: mcbain@REDACTED (Carlo Bertoldi)
Date: Thu, 24 Mar 2011 14:25:33 +0100
Subject: Time and system suspend
Message-ID:

Hi, I think I've found a bug.
Version I'm using: Erlang R13B03 (erts-5.7.4) [source] [smp:2:2]
[rq:2] [async-threads:0] [hipe] [kernel-poll:false]
on Linux.

Steps to reproduce the problem:
open an Erlang shell,
calendar:now_to_local_time(erlang:now()). It returns the correct
time

Suspend the computer without closing the erlang shell.
Take a nap ;)
Wake up the computer.
calendar:now_to_local_time(erlang:now()).

Now I can tell when I went to sleep, because the time printed is the
time at the moment of the suspension, plus the time
passed since the wake up. Please note that the system clock is fine.
To double check, I quit the erl shell, then fired it up again, and
then the time displayed was correct.

Regards,
Carlo Bertoldi
--
È molto più bello sapere qualcosa di tutto, che sapere tutto di una cosa.
Blaise Pascal

From hm@REDACTED Thu Mar 24 17:13:32 2011
From: hm@REDACTED (Håkan Mattsson)
Date: Thu, 24 Mar 2011 17:13:32 +0100
Subject: [erlang-bugs 1] Re: [erlang-bugs] Orber application don`t depend to mnesia
In-Reply-To: <20110324161147.1726b6d8@shimbo>
References: <20110324161147.1726b6d8@shimbo>
Message-ID:

Orber can make use of Mnesia, but the usage is optional.

/Håkan

On Thu, Mar 24, 2011 at 2:11 PM, Denis Afonin wrote:
> Hi,
>
> In embedded mode orber application attempt to start before mnesia, so
> it`s crashing.
>
> Erlang version: debian, 1:14.a-dfsg-3.
>
> Regards,
> Denis.
>
> PS Here the patch:
>
> diff -Naur erlang-14.a-dfsg/lib/orber/src/orber.app.src erlang-14.a-dfsg.1/lib/orber/src/orber.app.src
> --- erlang-14.a-dfsg/lib/orber/src/orber.app.src 2011-03-24 13:07:00.221000018 +0000
> +++ erlang-14.a-dfsg.1/lib/orber/src/orber.app.src 2011-03-24 13:06:14.045000018 +0000
> @@ -101,7 +101,7 @@
>                 orber_iiop_insup, orber_init, orber_reqno,
>                 orber_objkeyserver, orber_iiop_socketsup,
>                 orber_iiop_pm, orber_env]},
> -  {applications, [stdlib, kernel]},
> +  {applications, [stdlib, kernel, mnesia]},
>    {env, []},
>    {mod, {orber, []}}
>   ]}.
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
>

From attila.r.nohl@REDACTED Thu Mar 24 17:49:19 2011
From: attila.r.nohl@REDACTED (Attila Rajmund Nohl)
Date: Thu, 24 Mar 2011 17:49:19 +0100
Subject: [erlang-bugs 2] Re: [erlang-bugs] Time and system suspend
In-Reply-To:
References:
Message-ID:

2011/3/24, Carlo Bertoldi :
> Hi, I think I've found a bug.
> Version I'm using: Erlang R13B03 (erts-5.7.4) [source] [smp:2:2]
> [rq:2] [async-threads:0] [hipe] [kernel-poll:false]
> on Linux.
>
> Steps to reproduce the problem:
> open an Erlang shell,
> calendar:now_to_local_time(erlang:now()). It returns the correct
> time
>
> Suspend the computer without closing the erlang shell.
> Take a nap ;)
> Wake up the computer.
> calendar:now_to_local_time(erlang:now()).
>
> Now I can tell when I went to sleep, because the time printed is the
> time at the moment of the suspension, plus the time
> passed since the wake up. Please note that the system clock is fine.
> To double check, I quit the erl shell, then fired it up again, and
> then the time displayed was correct.

erlang:now() does not return the current time (despite its
documentation), but a tuple that is guaranteed to continuously
increase for subsequent calls. Use os:timestamp() to get the current
time.

From andrew@REDACTED Thu Mar 24 18:35:13 2011
From: andrew@REDACTED (Andrew Thompson)
Date: Thu, 24 Mar 2011 13:35:13 -0400
Subject: [erlang-bugs 3] Bug in -spec/@doc ordering in new edoc
Message-ID: <20110324173513.GK20461@hijacked.us>

Hi,

I just noticed an odd behaviour where the order in which a @doc and a
-spec appear affects whether the @doc appears in the edoc output. If I
put the -spec first, the only documentation for that function is the
spec; if I put the @doc first, they both show up.

Here's the workaround commit I had to make:

https://github.com/Vagabond/gen_smtp/commit/accfd881e92ae59946444987568217ba4bfa80c4

For some reason, the other two functions exported from that file work
fine with the 'spec before doc' style, just not this particular
function.

Andrew

From erlangsiri@REDACTED Fri Mar 25 09:48:51 2011
From: erlangsiri@REDACTED (Siri Hansen)
Date: Fri, 25 Mar 2011 09:48:51 +0100
Subject: [erlang-bugs 4] reltool's app_file option
Message-ID:

This has been corrected and will be included in R14B03.
Thanks for the contribution!

Regards
/siri

> The documentation lists 'keep', 'strip' and 'all' as valid values, but
> only 'keep' is allowed. The others give you an exit with "Illegal
> option: {app_file,all}".
>
> The following line in reltool_server.erl needs Val to be both 'strip'
> and 'all' simultaneously:
>
> app_file when Val =:= keep; Val =:= strip, Val =:= all ->
>
> In 0.5.3 (R13B04) and the dev branch.
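
The guard quoted in the reltool report above can be illustrated with a small stand-alone module. This is only a sketch: the module and function names are hypothetical, and the `fixed/1` clause is one plausible correction, not necessarily the change that went into R14B03. In an Erlang guard, `;` separates alternatives and `,` joins tests that must all hold, so the second alternative of the quoted guard can never succeed.

```erlang
-module(app_file_guard).
-export([broken/1, fixed/1]).

%% Guard copied from the report: "keep" OR ("strip" AND "all").
%% The second alternative requires Val to equal two different atoms
%% at once, so only 'keep' is ever accepted.
broken(Val) when Val =:= keep; Val =:= strip, Val =:= all -> ok;
broken(Val) -> {illegal_option, {app_file, Val}}.

%% Hypothetical correction: three alternatives separated by ';'.
fixed(Val) when Val =:= keep; Val =:= strip; Val =:= all -> ok;
fixed(Val) -> {illegal_option, {app_file, Val}}.
```

With this module compiled, `app_file_guard:broken(strip)` falls through to the error clause, while `app_file_guard:fixed(strip)` matches the first clause.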

From nick@REDACTED Fri Mar 25 10:51:44 2011
From: nick@REDACTED (Niclas Eklund)
Date: Fri, 25 Mar 2011 10:51:44 +0100
Subject: [erlang-bugs 5] Re: [erlang-bugs] Orber application don`t depend to mnesia
In-Reply-To: <20110324161147.1726b6d8@shimbo>
References: <20110324161147.1726b6d8@shimbo>
Message-ID:

Hello!

Thank you for reporting this, but this has already been changed and
released in the latest version (R14B02/orber-3.6.20).

Best Regards,

Niclas @ Erlang/OTP

On Thu, 24 Mar 2011, Denis Afonin wrote:

> Hi,
>
> In embedded mode orber application attempt to start before mnesia, so
> it`s crashing.
>
> Erlang version: debian, 1:14.a-dfsg-3.
>
> Regards,
> Denis.
>
> PS Here the patch:
>
> diff -Naur erlang-14.a-dfsg/lib/orber/src/orber.app.src erlang-14.a-dfsg.1/lib/orber/src/orber.app.src
> --- erlang-14.a-dfsg/lib/orber/src/orber.app.src 2011-03-24 13:07:00.221000018 +0000
> +++ erlang-14.a-dfsg.1/lib/orber/src/orber.app.src 2011-03-24 13:06:14.045000018 +0000
> @@ -101,7 +101,7 @@
>                 orber_iiop_insup, orber_init, orber_reqno,
>                 orber_objkeyserver, orber_iiop_socketsup,
>                 orber_iiop_pm, orber_env]},
> -  {applications, [stdlib, kernel]},
> +  {applications, [stdlib, kernel, mnesia]},
>    {env, []},
>    {mod, {orber, []}}
>   ]}.
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>

From ulf.wiger@REDACTED Fri Mar 25 11:42:27 2011
From: ulf.wiger@REDACTED (Ulf Wiger)
Date: Fri, 25 Mar 2011 11:42:27 +0100
Subject: [erlang-bugs 6] make release_tests fails if wxWidgets is not installed
Message-ID:

When building OTP on a Mac without wx installed, make works, but
make release_tests fails, not just for wx, but for et and reltool as
well.

Putting in SKIP files doesn't help, but renaming the 'test' directory
in those apps does.

BR,
Ulf

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com

From fdmanana@REDACTED Sun Mar 27 19:07:03 2011
From: fdmanana@REDACTED (Filipe David Manana)
Date: Sun, 27 Mar 2011 18:07:03 +0100
Subject: [erlang-bugs 7] possible supervisor bug in r14b02
Message-ID:

Hi,

In R14B02 I noticed that for a child with a "temporary" restart_type,
we discard its A component of the MFA tuple when adding the childspec
to the list of the supervisor's children [1].

When the child terminates, its spec is never removed from the list of
the supervisor's children specs.
Then if we call supervisor:restart_child/2 after the child terminates,
the handle_call clause for restart_child gets the childspec with an
MFA that is {M, F, undefined} [2]. At that point do_start_child will
call apply(M, F, undefined) [3] which will cause the supervisor to
reply with an error, instead of returning {ok, Pid} as in previous
releases. An example for the returned error:

{error,{'EXIT',{badarg,[{erlang,apply,[gen_server,start_link,undefined]},
                        {supervisor,do_start_child,2},
                        {supervisor,handle_call,3},
                        {gen_server,handle_msg,5},
                        {proc_lib,init_p_do_apply,3}]}}}

The patch at [4] fixes the issue for me. The particular code that is
no longer working in R14B02 but worked on all previous releases is
from Apache CouchDB, see [5].

This issue was introduced by OTP-9064 (reading from the R14B02 release notes).
Was this intended behaviour? It doesn't make much sense to me to keep
a temporary childspec in the supervisor once the child terminates, so
I believe deleting it from the state is the right thing to do.
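
The badarg in the error quoted above comes from apply/3 being handed the atom undefined where an argument list is expected. A minimal sketch of just that step, independent of supervisor (the module name and the use of lists:reverse/1 are illustrative assumptions, not taken from the report):

```erlang
-module(apply_args_demo).
-export([call/1]).

%% apply/3 requires its third argument to be a list of arguments.
%% Passing the atom 'undefined', as a childspec with a discarded
%% argument list would, raises badarg instead of calling the function.
call(Args) ->
    try apply(lists, reverse, Args) of
        Result -> {ok, Result}
    catch
        error:badarg -> {error, badarg}
    end.
```

Calling `apply_args_demo:call([[1,2,3]])` reverses the list as usual, while `apply_args_demo:call(undefined)` takes the badarg branch, which matches the first frame of the stack trace in the report.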

[1] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L787
[2] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L314
[3] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L246
[4] - https://github.com/fdmanana/otp/commit/2697042aa9ebab2fcd208c93b7f454b25bc580d4
[5] - https://github.com/apache/couchdb/blob/trunk/src/couchdb/couch_replicator.erl#L119

--
Filipe David Manana,
fdmanana@REDACTED, fdmanana@REDACTED

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

From ingela@REDACTED Mon Mar 28 10:24:59 2011
From: ingela@REDACTED (Ingela Anderton Andin)
Date: Mon, 28 Mar 2011 10:24:59 +0200
Subject: [erlang-bugs 8] Re: possible supervisor bug in r14b02
In-Reply-To:
References:
Message-ID: <4D9045DB.5060509@erix.ericsson.se>

Hi!

We will take a look at your patch; it sounds like it is the right
thing to do.
Temporary processes should not be restarted, so you should not have
to save their init-arguments, although until the last release they were
saved, so it was possible to restart them! Especially for
simple_one_for_one supervisors that may have lots of temporary
processes, memory consumption can go sky high if you save them.

Regards Ingela Erlang OTP team - Ericsson AB

Filipe David Manana wrote:
> Hi,
>
> In R14B02 I noticed that for a child with a "temporary" restart_type,
> we discard its A component of the MFA tuple when adding the childspec
> to the list of the supervisor's children [1].
>
> When the child terminates, its spec is never removed from the list of
> the supervisor's children specs.
> Then if we call supervisor:restart_child/2 after the child terminates,
> the handle_call clause for restart_child gets the childspec with an
> MFA that is {M, F, undefined} [2].
> At that point do_start_child will
> call apply(M, F, undefined) [3] which will cause the supervisor to
> reply with an error, instead of returning {ok, Pid} as in previous
> releases. An example for the returned error:
>
> {error,{'EXIT',{badarg,[{erlang,apply,[gen_server,start_link,undefined]},
>                         {supervisor,do_start_child,2},
>                         {supervisor,handle_call,3},
>                         {gen_server,handle_msg,5},
>                         {proc_lib,init_p_do_apply,3}]}}}
>
> The patch at [4] fixes the issue for me. The particular code that is
> no longer working in R14B02 but worked on all previous releases, is
> from Apache CouchDB, see [5]
>
> This issue was introduced by OTP-9064 (reading from the R14B02 release notes).
> Was this intended behaviour? It doesn't make much sense for me to keep
> a temporary childspec in the supervisor once the child terminates, so
> I believe deleting it from the state is the right thing to do.
>
> [1] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L787
> [2] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L314
> [3] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L246
> [4] - https://github.com/fdmanana/otp/commit/2697042aa9ebab2fcd208c93b7f454b25bc580d4
> [5] - https://github.com/apache/couchdb/blob/trunk/src/couchdb/couch_replicator.erl#L119
>
>

From fdmanana@REDACTED Mon Mar 28 12:16:11 2011
From: fdmanana@REDACTED (Filipe David Manana)
Date: Mon, 28 Mar 2011 11:16:11 +0100
Subject: [erlang-bugs 9] Re: possible supervisor bug in r14b02
In-Reply-To: <4D9045DB.5060509@erix.ericsson.se>
References: <4D9045DB.5060509@erix.ericsson.se>
Message-ID:

Thanks :)

On Mon, Mar 28, 2011 at 9:24 AM, Ingela Anderton Andin wrote:
> Hi!
>
> We will take a look at your patch it sounds like it is the right thing to
> do.
> Temporary processes should not be restarted so you should not have
> to save their init-arguments, although until the last release they were
> saved
> so it was possible to restart them!
> Especially for simple_one_for_one
> supervisors
> that may have lots of temporary processes memory consumption can
> go sky high if you save them.
>
> Regards Ingela Erlang OTP team - Ericsson AB
>
> Filipe David Manana wrote:
>>
>> Hi,
>>
>> In R14B02 I noticed that for a child with a "temporary" restart_type,
>> we discard its A component of the MFA tuple when adding the childspec
>> to the list of the supervisor's children [1].
>>
>> When the child terminates, its spec is never removed from the list of
>> the supervisor's children specs.
>> Then if we call supervisor:restart_child/2 after the child terminates,
>> the handle_call clause for restart_child gets the childspec with an
>> MFA that is {M, F, undefined} [2]. At that point do_start_child will
>> call apply(M, F, undefined) [3] which will cause the supervisor to
>> reply with an error, instead of returning {ok, Pid} as in previous
>> releases. An example for the returned error:
>>
>> {error,{'EXIT',{badarg,[{erlang,apply,[gen_server,start_link,undefined]},
>>                         {supervisor,do_start_child,2},
>>                         {supervisor,handle_call,3},
>>                         {gen_server,handle_msg,5},
>>                         {proc_lib,init_p_do_apply,3}]}}}
>>
>> The patch at [4] fixes the issue for me. The particular code that is
>> no longer working in R14B02 but worked on all previous releases, is
>> from Apache CouchDB, see [5]
>>
>> This issue was introduced by OTP-9064 (reading from the R14B02 release
>> notes).
>> Was this intended behaviour? It doesn't make much sense for me to keep
>> a temporary childspec in the supervisor once the child terminates, so
>> I believe deleting it from the state is the right thing to do.
>> >> [1] - >> https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L787 >> [2] - >> https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L314 >> [3] - >> https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L246 >> [4] - >> https://github.com/fdmanana/otp/commit/2697042aa9ebab2fcd208c93b7f454b25bc580d4 >> [5] - >> https://github.com/apache/couchdb/blob/trunk/src/couchdb/couch_replicator.erl#L119 >> >> > > _______________________________________________ > erlang-bugs mailing list > erlang-bugs@REDACTED > http://erlang.org/mailman/listinfo/erlang-bugs > -- Filipe David Manana, fdmanana@REDACTED, fdmanana@REDACTED "Reasonable men adapt themselves to the world. ?Unreasonable men adapt the world to themselves. ?That's why all progress depends on unreasonable men." From philippu@REDACTED Mon Mar 28 12:38:57 2011 From: philippu@REDACTED (Philipp Unterbrunner) Date: Mon, 28 Mar 2011 12:38:57 +0200 Subject: [erlang-bugs 10] Re: [erlang-bugs] Distributed node crashes silently when initially receiving a big chunk of messages from another node In-Reply-To: <4D652468.7000404@inf.ethz.ch> References: <4D652468.7000404@inf.ethz.ch> Message-ID: <4D906541.2060506@inf.ethz.ch> The bug persists in r14b02. If I find time, I will make a small demo application so that others can reproduce the bug. Philipp On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote: > Hello, > > I have run into a serious and very annoying bug. > > Affects (at least); R13B04, R14A, R14B, R14B01 > Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP) > > When a newly started distributed node receives a high number of messages from another node, the newly started node crashes silently. Nothing is printed to the console. No crash dump or core dump is produced. > > In trying to find a work-around, I found the following curious behavior: > > * The bug *only* occurs for distributed nodes (but regardless of whether the nodes run on the same machine). 
> * Waiting a few seconds (or even longer) before sending the first message to the newly started node does *not* make a difference. The node will still crash when confronted with a large number of incoming messages later. > * Speed matters. When doing a debug build, the bug appears less often then when doing a release build, especially when HiPE is enabled. However, I managed to cause the bug even in debug mode, and when OTP was not compiled with native libs. The bug is simply much less likely to be observed. > * The number of messages sent *initially* matters most. Slowly "ramping up" the load is a work-around. Once a node is working at high throughput, it is OK to stop sending messages for an arbitrary period and at a later point send a big chunk of messages that would have killed the node if sent initially. > * Timing matters. Running the receiver node with +T 7 or higher makes the problem disappear. > * Setting the sender node's distribution buffer size to the minimum (+zdbbl 1) makes the problem appear less often. > > I have reproduced the bug in various applications. The behavior described above also makes it fairly obvious that the application is not at fault. > > Rather, it appears that the receiver node is unable to buffer incoming messages and crashes. Of particular interest here is the fact that "ramping up" the load is a work-around. I suspect a low-level race condition where the receiver node does not allocate sufficient buffer space in time and crashes. > > Given that the existing work-arounds are not desirable ("ramp up" requires changes to the application code, +T 7 and +zdbbl 1 decrease performance), and given that the bug now persists over multiple releases, I hope someone can soon look into it. > > Thank you, > > Philipp -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 262 bytes
Desc: OpenPGP digital signature
URL:

From emile@REDACTED Tue Mar 29 13:46:19 2011
From: emile@REDACTED (Emile Joubert)
Date: Tue, 29 Mar 2011 12:46:19 +0100
Subject: [erlang-bugs] crypto from windows service?
Message-ID: <4D91C68B.2070702@rabbitmq.com>

Hi,

I'm unable to start the crypto module in an Erlang VM installed as a
Windows service, if that service has any stopaction or a debugtype
specified.

Here are the steps to reproduce:

> erlsrv add test -st halt() -sn test@REDACTED
> erlsrv start test
> werl.exe -remsh test@REDACTED -sname tmp
(in the werl window)
1> crypto:start().

When left long enough this leads to

** Node 'test@REDACTED' not responding **
** Removing (timedout) connection **

Specifying a debugtype of new or reuse also leads to a timeout:

> erlsrv add test -de new -sn test@REDACTED
> erlsrv start test
> werl.exe -remsh test@REDACTED -sname tmp
(in the werl window)
1> crypto:start().

When the service is installed without a stopaction and without a
debugtype specified, the crypto module works fine:

> erlsrv add test -sn test
> erlsrv start test
> werl.exe -remsh test@REDACTED -sname tmp
(in the werl window)
1> crypto:rand_bytes(1).
<<"?">>

I've observed this behaviour on versions R14B01 and R14B02 on Windows XP
32bit. Is this a known issue, and is there a better workaround than not
specifying a stopaction or a debugtype?


Regards

Emile

From kruber@REDACTED Tue Mar 29 15:06:46 2011
From: kruber@REDACTED (Nico Kruber)
Date: Tue, 29 Mar 2011 15:06:46 +0200
Subject: [erlang-bugs] UTF8 string handling in different erlang:*** functions
Message-ID: <201103291506.57110.kruber@zib.de>

Is it possible that UTF8 strings are not supported by both
erlang:md5/1 and
erlang:list_to_binary/1 (and possibly more?)

I'm getting a bad argument exception when running the following:

> erlang:md5("Wàgrain (Wågrŏã)").
** exception error: bad argument
     in function  erlang:md5/1
        called as erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,41])

Even simpler, one can call:
> erlang:md5([256]).
** exception error: bad argument
     in function  erlang:md5/1
        called as erlang:md5([256])

For characters larger than 255, this exception is thrown. The same goes for
erlang:list_to_binary/1.

Both state that the input should be an iodata() or iolist(), which are defined
as:

iodata() = iolist() | binary()
iolist() = [char() | binary() | iolist()]
% a binary is allowed as the tail of the list

And according to
http://www.erlang.org/doc/reference_manual/typespec.html
a character is any valid integer between 0 and 16#10ffff, and it should be this
way since erlang strings are unicode strings.

If this is correct behaviour, then how do I hash a unicode string without
using erlang:term_to_binary/1 (which is possibly costly and should be
unnecessary)?

Regards
Nico
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL:

From bob@REDACTED Tue Mar 29 15:17:03 2011
From: bob@REDACTED (Bob Ippolito)
Date: Tue, 29 Mar 2011 09:17:03 -0400
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:*** functions
In-Reply-To: <201103291506.57110.kruber@zib.de>
References: <201103291506.57110.kruber@zib.de>
Message-ID:

On Tue, Mar 29, 2011 at 9:06 AM, Nico Kruber wrote:
> is it possible that UTF8 strings are not supported by both
> erlang:md5/1 and
> erlang:list_to_binary/1 (and possibly more?)
>
> I'm getting a bad argument exception when running the following:
>
>> erlang:md5("W?grain (W?gr??)").
> ** exception error: bad argument
>      in function  erlang:md5/1
>         called as
> erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,
>                               41])
>
> even simpler, one can call:
>> erlang:md5([256]).
> ** exception error: bad argument
>      in function  erlang:md5/1
>         called as erlang:md5([256])
>
>
> for characters larger than 255, this exception is thrown. same for
> erlang:list_to_binary/1.
>
> Both state that the input should be an iodata() or iolist() which are defined
> as:
>
> iodata() = iolist() | binary()
> iolist() = [char() | binary() | iolist()]
> % a binary is allowed as the tail of the list
>
> And according to
> http://www.erlang.org/doc/reference_manual/typespec.html
> a character is any valid integer between 0 and 16#10ffff and it should be this
> way since erlang strings are unicode strings.
>
> If this is correct behaviour, then how do I hash a unicode string without
> using erlang:term_to_binary/1 (which is possibly costly and should be
> unnecessary).

What you have is not UTF8, because UTF8 is defined over bytes
(0..255). IIRC, the actual definition of iolist should be
maybe_improper_list(byte() | binary() | iolist(), binary()). Functions
like erlang:list_to_binary/1 and erlang:md5/1 also only make sense
over bytes.

You can convert a list of unicode code points (L) to UTF8 with
unicode:characters_to_binary(L, utf8).

-bob

From pan@REDACTED Tue Mar 29 15:26:11 2011
From: pan@REDACTED (pan@REDACTED)
Date: Tue, 29 Mar 2011 15:26:11 +0200
Subject: [erlang-bugs] Re: [erlang-bugs 10] Re: Distributed node crashes silently when initially receiving a big chunk of messages from another node
In-Reply-To: <4D906541.2060506@inf.ethz.ch>
References: <4D652468.7000404@inf.ethz.ch> <4D906541.2060506@inf.ethz.ch>
Message-ID:

Hi!

This sounds really bad! A demo application that reproduces the bug would
be really nice.

Have you tried to enable core dumps to see if the erlang node crashes with
a segfault? I suppose there are no erl_crash.dump files left after the
crash that I can look at either?

Any way to reproduce it would make it more easy to find!

Cheers,
/Patrik

On Mon, 28 Mar 2011, Philipp Unterbrunner wrote:

> The bug persists in r14b02.
> > If I find time, I will make a small demo application so that others can > reproduce the bug. > > Philipp > > On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote: >> Hello, >> >> I have run into a serious and very annoying bug. >> >> Affects (at least); R13B04, R14A, R14B, R14B01 >> Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP) >> >> When a newly started distributed node receives a high number of messages from another node, the newly started node crashes silently. Nothing is printed to the console. No crash dump or core dump is produced. >> >> In trying to find a work-around, I found the following curious behavior: >> >> * The bug *only* occurs for distributed nodes (but regardless of whether the nodes run on the same machine). >> * Waiting a few seconds (or even longer) before sending the first message to the newly started node does *not* make a difference. The node will still crash when confronted with a large number of incoming messages later. >> * Speed matters. When doing a debug build, the bug appears less often then when doing a release build, especially when HiPE is enabled. However, I managed to cause the bug even in debug mode, and when OTP was not compiled with native libs. The bug is simply much less likely to be observed. >> * The number of messages sent *initially* matters most. Slowly "ramping up" the load is a work-around. Once a node is working at high throughput, it is OK to stop sending messages for an arbitrary period and at a later point send a big chunk of messages that would have killed the node if sent initially. >> * Timing matters. Running the receiver node with +T 7 or higher makes the problem disappear. >> * Setting the sender node's distribution buffer size to the minimum (+zdbbl 1) makes the problem appear less often. >> >> I have reproduced the bug in various applications. The behavior described above also makes it fairly obvious that the application is not at fault. 
>> >> Rather, it appears that the receiver node is unable to buffer incoming messages and crashes. Of particular interest here is the fact that "ramping up" the load is a work-around. I suspect a low-level race condition where the receiver node does not allocate sufficient buffer space in time and crashes. >> >> Given that the existing work-arounds are not desirable ("ramp up" requires changes to the application code, +T 7 and +zdbbl 1 decrease performance), and given that the bug now persists over multiple releases, I hope someone can soon look into it. >> >> Thank you, >> >> Philipp > From pan@REDACTED Tue Mar 29 16:08:26 2011 From: pan@REDACTED (pan@REDACTED) Date: Tue, 29 Mar 2011 16:08:26 +0200 Subject: [erlang-bugs] Re: crypto from windows service? In-Reply-To: <4D91C68B.2070702@rabbitmq.com> References: <4D91C68B.2070702@rabbitmq.com> Message-ID: Hi! I am unable to reproduce the problem, but a wild guess would be that the openssl libraries (dll's) get messed up in some way by the small differences in process creation when you connect the stdout/stdin to a pipe. Have you tried updating openssl on the machine? What happens if you specify debugtype console? Does anything show up in the debug log or in the event viewer when the node crashes? Does the node really crash or is it only the connection that fails? Cheers, /Patrik On Tue, 29 Mar 2011, Emile Joubert wrote: > Hi, > > I'm unable to start the crypto module in an Erlang VM installed as a Windows > service, if that service has any stopaction or a debugtype specified. > > Here are the steps to reproduce: > >> erlsrv add test -st halt() -sn test@REDACTED >> erlsrv start test >> werl.exe -remsh test@REDACTED -sname tmp > (in the werl window) > 1> crypto:start(). 
> > When left long enough this leads to > > ** Node 'test@REDACTED' not responding ** > ** Removing (timedout) connection ** > > Specifying a debugtype of new or reuse also leads to a timeout: > >> erlsrv add test -de new -sn test@REDACTED >> erlsrv start test >> werl.exe -remsh test@REDACTED -sname tmp > (in the werl window) > 1> crypto:start(). > > When the service is installed without a stopaction and without a debugtype > specified then the crypto module works fine: > >> erlsrv add test -sn test >> erlsrv start test >> werl.exe -remsh test@REDACTED -sname tmp > (in the werl window) > 1> crypto:rand_bytes(1). > <<"?">> > > I've observed this behaviour on version R14B01 and R14B02 on Windows XP > 32bit. Is this a known issue and is there a better workaround than not > specifying stopaction or a debugtype ? > > > Regards > > Emile > > > _______________________________________________ > erlang-bugs mailing list > erlang-bugs@REDACTED > http://erlang.org/mailman/listinfo/erlang-bugs > From kruber@REDACTED Tue Mar 29 16:10:28 2011 From: kruber@REDACTED (Nico Kruber) Date: Tue, 29 Mar 2011 16:10:28 +0200 Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:*** functions In-Reply-To: References: <201103291506.57110.kruber@zib.de> Message-ID: <201103291610.35983.kruber@zib.de> On Tuesday 29 March 2011 15:17:03 Bob Ippolito wrote: > On Tue, Mar 29, 2011 at 9:06 AM, Nico Kruber wrote: > > is it possible that UTF8 strings are not supported by both > > erlang:md5/1 and > > erlang:list_to_binary/1 (and possibly more?) > > > > I'm getting a bad argument exception when running the following: > >> erlang:md5("W?grain (W?gr??)"). > > > > ** exception error: bad argument > > in function erlang:md5/1 > > called as > > erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227, > > 41]) > > > > even simpler, one can call: > >> erlang:md5([256]). 
> >
> > ** exception error: bad argument
> >      in function  erlang:md5/1
> >         called as erlang:md5([256])
> >
> >
> > for characters larger than 255, this exception is thrown. same for
> > erlang:list_to_binary/1.
> >
> > Both state that the input should be an iodata() or iolist() which are
> > defined as:
> >
> > iodata() = iolist() | binary()
> > iolist() = [char() | binary() | iolist()]
> > % a binary is allowed as the tail of the list
> >
> > And according to
> > http://www.erlang.org/doc/reference_manual/typespec.html
> > a character is any valid integer between 0 and 16#10ffff and it should be
> > this way since erlang strings are unicode strings.
> >
> > If this is correct behaviour, then how do I hash a unicode string without
> > using erlang:term_to_binary/1 (which is possibly costly and should be
> > unnecessary).
>
> What you have is not UTF8, because UTF8 is defined over bytes
> (0..255).

Oh, right. This was maybe misleading; I should rather have said "erlang
string".

> IIRC, the actual definition of iolist should be
> maybe_improper_list(byte() | binary() | iolist(), binary()). Functions
> like erlang:list_to_binary/1 and erlang:md5/1 also only make sense
> over bytes.

OK, makes sense, although it is rather inconvenient not being able to hash
strings :(

> You can convert a list of unicode code points (L) to UTF8 with
> unicode:characters_to_binary(L, utf8).

OK, thanks for the tip. FYI, I ran a simple benchmark executing
unicode:characters_to_binary/1 and erlang:term_to_binary/1 a million times
with the same string, which resulted in the following:

> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s:
33944331.2966734541/s
> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s:
1498084.69871269591/s

-> looks like I should choose erlang:term_to_binary/1 since, at least on my
machine, it is around twice as fast.

Nico
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From bob@REDACTED Tue Mar 29 16:29:38 2011 From: bob@REDACTED (Bob Ippolito) Date: Tue, 29 Mar 2011 10:29:38 -0400 Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:*** functions In-Reply-To: <201103291610.35983.kruber@zib.de> References: <201103291506.57110.kruber@zib.de> <201103291610.35983.kruber@zib.de> Message-ID: On Tue, Mar 29, 2011 at 10:10 AM, Nico Kruber wrote: > On Tuesday 29 March 2011 15:17:03 Bob Ippolito wrote: >> On Tue, Mar 29, 2011 at 9:06 AM, Nico Kruber wrote: >> > is it possible that UTF8 strings are not supported by both >> > erlang:md5/1 and >> > erlang:list_to_binary/1 (and possibly more?) >> > >> > I'm getting a bad argument exception when running the following: >> >> erlang:md5("W?grain (W?gr??)"). >> > >> > ** exception error: bad argument >> > ? ? in function ?erlang:md5/1 >> > ? ? ? ?called as >> > erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227, >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?41]) >> > >> > even simpler, one can call: >> >> erlang:md5([256]). >> > >> > ** exception error: bad argument >> > ? ? in function ?erlang:md5/1 >> > ? ? ? ?called as erlang:md5([256]) >> > >> > >> > for characters larger than 255, this exception is thrown. same for >> > erlang:list_to_binary/1. >> > >> > Both state that the input should be an iodata() or iolist() which are >> > defined as: >> > >> > iodata() = iolist() | binary() >> > iolist() = [char() | binary() | iolist()] >> > % ?a binary is allowed as the tail of the list >> > >> > And according to >> > http://www.erlang.org/doc/reference_manual/typespec.html >> > a character is any valid integer between 0 and 16#10ffff and it should be >> > this way since erlang strings are unicode strings. 
>> > >> > If this is correct behaviour, then how do I hash a unicode string without >> > using erlang:term_to_binary/1 (which is possibly costly and should be >> > unnecessary). >> >> What you have is not UTF8, because UTF8 is defined over bytes >> (0..255). > > oh, right - this was maybe misleading, I should have rather said "erlang > string" > >> IIRC, the actual definition of iolist should be >> maybe_improper_list(byte() | binary() | iolist(), binary()). Functions >> like erlang:list_to_binary/1 and erlang:md5/1 also only make sense >> over bytes. > > ok, makes sense, although it is rather inconvenient not being able to hash > strings :( The real lesson here is "do not use erlang strings". Binaries in UTF8 are better for most use cases that I've come across in the past few years. A bit uglier in the source, but the memory and performance benefits make it worthwhile. >> You can convert a list of unicode code points (L) to UTF8 with >> unicode:characters_to_binary(L, utf8). > > ok, thanks for the tip - FYI, I ran a simple benchmark executing > unicode:characters_to_binary/1 and erlang:term_to_binary/1 a Million times > with the same string which resulted in the following: > >> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s: > 33944331.2966734541/s >> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s: > 1498084.69871269591/s > > -> looks like I should chose erlang:term_to_binary/1 since at least on my > machine is is around twice as fast. I guess it depends on if you care what the result is... these operations are completely different, and there's not even any guarantee that erlang:term_to_binary/1 is always going to return the same output for a given input... there is more than one possible representation for a string in external term format, and the spec does not guarantee that the implementation will do it any particular way. 
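For the hashing question that started this thread, the approach suggested above (encode the code points as UTF-8 first, then hash the resulting bytes) could be sketched roughly as follows. This is only an illustration: the module name hash_demo and the function md5_utf8/1 are made up for this example, while unicode:characters_to_binary/3 and erlang:md5/1 are the standard functions. Unlike the output of erlang:term_to_binary/1, the UTF-8 encoding of a given list of code points is fixed, so the digest does not depend on how the runtime chooses to serialise the term.

```erlang
-module(hash_demo).
-export([md5_utf8/1]).

%% Hypothetical helper: hash a list of Unicode code points (or any
%% chardata) by encoding it as UTF-8 bytes before calling erlang:md5/1.
%% Invalid or truncated input is returned as-is instead of crashing.
md5_utf8(Chars) ->
    case unicode:characters_to_binary(Chars, unicode, utf8) of
        Bin when is_binary(Bin) ->
            %% Every element is now a byte, so erlang:md5/1 accepts it.
            {ok, erlang:md5(Bin)};
        {error, _Encoded, _Rest} = Error ->
            Error;
        {incomplete, _Encoded, _Rest} = Incomplete ->
            Incomplete
    end.
```

Calling hash_demo:md5_utf8([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,41]) with the list from the original report should then return {ok, Digest} with a 16-byte binary, rather than raising badarg on the code point 335.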
-bob

From psa@REDACTED Tue Mar 29 16:34:22 2011
From: psa@REDACTED (Paulo Sérgio Almeida)
Date: Tue, 29 Mar 2011 15:34:22 +0100
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:*** functions
In-Reply-To: <201103291610.35983.kruber@zib.de>
References: <201103291506.57110.kruber@zib.de> <201103291610.35983.kruber@zib.de>
Message-ID: <4D91EDEE.1070300@di.uminho.pt>

On 3/29/11 3:10 PM, Nico Kruber wrote:
>> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s:
> 33944331.2966734541/s
>> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s:
> 1498084.69871269591/s
>
> -> looks like I should choose erlang:term_to_binary/1 since at least on my
> machine it is around twice as fast.

It's not twice but 20 times as fast. Amazing. Even though it should be
slower, being this much slower is surprising.

Regards,
Paulo

From kostis@REDACTED Tue Mar 29 17:34:29 2011
From: kostis@REDACTED (Kostis Sagonas)
Date: Tue, 29 Mar 2011 18:34:29 +0300
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:*** functions
In-Reply-To: <4D91EDEE.1070300@di.uminho.pt>
References: <201103291506.57110.kruber@zib.de> <201103291610.35983.kruber@zib.de> <4D91EDEE.1070300@di.uminho.pt>
Message-ID: <4D91FC05.4090802@cs.ntua.gr>

Paulo Sérgio Almeida wrote:
> On 3/29/11 3:10 PM, Nico Kruber wrote:
>>> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s:
>> 33944331.2966734541/s
>>> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s:
>> 1498084.69871269591/s
>>
>> -> looks like I should choose erlang:term_to_binary/1 since at least
>> on my
>> machine it is around twice as fast.
> It's not twice but 20 times as fast. Amazing. Even though it should be
> slower, being this much slower is surprising.

I have trouble reproducing these numbers, both the 2 and the 20.
With the program at the end of this mail, on an x86_64, I get:

Eshell V5.8.3 (abort with ^G)
1> c(t).
{ok,t}
2> timer:tc(t, t2b, [1000000]).
{133505,ok}
3> timer:tc(t, c2b, [1000000]).
{636624,ok}

which makes the term_to_binary version about 4 times as fast on this
machine. On a 32-bit machine the difference is about 6 - 6.5 times.

Kostis

%%==============================================================
-module(t).

-export([t2b/1, c2b/1]).

-define(S, "some medium sized string here").

t2b(N) ->
    lists:foreach(fun (_) -> erlang:term_to_binary(?S) end, lists:seq(1,N)).

c2b(N) ->
    lists:foreach(fun (_) -> unicode:characters_to_binary(?S) end,
                  lists:seq(1,N)).

From kruber@REDACTED Tue Mar 29 17:57:06 2011
From: kruber@REDACTED (Nico Kruber)
Date: Tue, 29 Mar 2011 17:57:06 +0200
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:*** functions
In-Reply-To: <4D91FC05.4090802@cs.ntua.gr>
References: <201103291506.57110.kruber@zib.de> <4D91EDEE.1070300@di.uminho.pt> <4D91FC05.4090802@cs.ntua.gr>
Message-ID: <201103291757.06312.kruber@zib.de>

On Tuesday 29 March 2011 17:34:29 you wrote:
> Paulo Sérgio Almeida wrote:
> > On 3/29/11 3:10 PM, Nico Kruber wrote:
> >
> > It's not twice but 20 times as fast. Amazing. Even though it should be
> > slower, this slower is surprising.
>
> I have trouble reproducing these numbers, both the 2 and the 20.
> With the program at the end of this mail, on an x86_64, I get:
>
> Eshell V5.8.3 (abort with ^G)
> 1> c(t).
> {ok,t}
> 2> timer:tc(t, t2b, [1000000]).
> {133505,ok}
> 3> timer:tc(t, c2b, [1000000]).
> {636624,ok}
>
> which makes the term_to_binary version about 4 times as fast on this
> machine. On a 32-bit machine the difference is about 6 - 6.5 times.
>
> Kostis
>
> %%==============================================================
> -module(t).
>
> -export([t2b/1, c2b/1]).
>
> -define(S, "some medium sized string here").
>
> t2b(N) ->
>     lists:foreach(fun (_) -> erlang:term_to_binary(?S) end, lists:seq(1,N)).
>
> c2b(N) ->
>     lists:foreach(fun (_) -> unicode:characters_to_binary(?S) end,
>                   lists:seq(1,N)).

The lists:seq(1,1000000) call will additionally slow down the process, as it
creates the whole list first
-> I used the following loop for my benchmark:

%%==============================================================
-spec iter(Count::pos_integer(), F::fun(() -> any()), Tag::string()) -> ok.
iter(Count, F, Tag) ->
    %% Warm-up call, so the first (possibly slower) call is not timed.
    F(),
    Start = erlang:now(),
    iter_inner(Count, F),
    Stop = erlang:now(),
    %% now_diff/2 returns microseconds; convert to seconds.
    ElapsedTime = timer:now_diff(Stop, Start) / 1000000.0,
    Frequency = Count / ElapsedTime,
    ct:pal("~p iterations of ~p took ~ps: ~p1/s~n",
           [Count, Tag, ElapsedTime, Frequency]),
    ok.

%% Tail-recursive loop: calls F() exactly Count times without building a list.
-spec iter_inner(Count::non_neg_integer(), F::fun(() -> any())) -> ok.
iter_inner(0, _) ->
    ok;
iter_inner(N, F) ->
    F(),
    iter_inner(N - 1, F).
%%==============================================================

Regarding 2 vs 20: I simply misread the numbers :(

Nico
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL:

From emile@REDACTED Tue Mar 29 19:05:53 2011
From: emile@REDACTED (Emile Joubert)
Date: Tue, 29 Mar 2011 18:05:53 +0100
Subject: [erlang-bugs] Re: crypto from windows service?
In-Reply-To:
References: <4D91C68B.2070702@rabbitmq.com>
Message-ID: <4D921171.3010707@rabbitmq.com>

Hi Patrik,

On 29/03/11 15:08, pan@REDACTED wrote:
> Hi!
>
> I am unable to reproduce the problem, but a wild guess would be that the
> openssl libraries (dll's) get messed up in some way by the small
> differences in process creation when you connect the stdout/stdin to a
> pipe. Have you tried updating openssl on the machine? What happens if

I've been able to reproduce the same problem on a separate new 32bit
Windows XP SP3 install with R14B02 and Win32 OpenSSL v0.9.8r Light.

> you specify debugtype console? Does anything show up in the debug log or

Setting -debugtype to console pops up a console when the service starts
(which is inconvenient for a service under normal circumstances).
Starting the crypto app from the popped up console or from a remotely attached node does work however. The problem in my report only occurs if debugtype is set to new or reuse, or if any stopaction is specified. > in the event viewer when the node crashes? Does the node really crash or > is it only the connection that fails? Nothing useful appears in the OS event viewer after a crash. The node really crashes, not just the connection. It is impossible to establish new connections. Regards Emile From kruber@REDACTED Wed Mar 30 12:04:58 2011 From: kruber@REDACTED (Nico Kruber) Date: Wed, 30 Mar 2011 12:04:58 +0200 Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:*** functions In-Reply-To: References: <201103291506.57110.kruber@zib.de> <201103291757.06312.kruber@zib.de> Message-ID: <201103301205.02403.kruber@zib.de> On Wednesday 30 March 2011 11:47:28 Patrik Nyblom wrote: > Hi! > > To properly measure this, one has to bear in mind that > erlang:term_to_binary() gets evaluated at compile > time, while unicode:characters_to_binary() does not. that's what I was thinking, too, but haven't had time to work around yet > Using this program: > ------------------- > t2bfun() -> > fun(X) -> erlang:term_to_binary(X) end. > c2bfun() -> > fun(X) -> unicode:characters_to_binary(X,unicode) end. > > iter(Count, F, String, Tag) -> > {_,Red0} = erlang:process_info(self(),reductions), > F(String), > {_,Red1} = erlang:process_info(self(),reductions), > io:format("Reductions for one call: ~w~n",[Red1 - Red0]), > Start = erlang:now(), > iter_inner(Count, F, String), > Stop = erlang:now(), > ElapsedTime = timer:now_diff(Stop, Start) / 1000000.0, > Frequency = Count / ElapsedTime, > ct:pal("~p iterations of ~p took ~ps: ~p1/s~n", > [Count, Tag, ElapsedTime, Frequency]), > ok. > > iter_inner(0, _,_) -> > ok; > iter_inner(N, F, String) -> > F(String), > iter_inner(N - 1, F, String). 
> ------------------ > doing: > ------------------ > 26> > StringWUnicode="jkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saa > dakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saad > akfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saada > kfd??sakfd??s". > "jkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd?? > sjkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??s > jkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??s" > 27> t:iter(1000000,t:c2bfun(),StringWUnicode,c2b). > Reductions for one call: 30 > ---------------------------------------------------- > 2011-03-30 11:36:25.808 > 1000000 iterations of c2b took 3.280548s: 304827.120346966431/s > > > ok > 28> t:iter(1000000,t:t2bfun(),StringWUnicode,t2b). > Reductions for one call: 4 > ---------------------------------------------------- > 2011-03-30 11:36:38.837 > 1000000 iterations of t2b took 1.72605s: 579357.49254077231/s > > > ok > 29> KostisString="some medium sized string here". > "some medium sized string here" > 30> t:iter(1000000,t:c2bfun(),KostisString,c2b). > Reductions for one call: 6 > ---------------------------------------------------- > 2011-03-30 11:37:34.952 > 1000000 iterations of c2b took 0.543842s: 1838769.34845046891/s > > > ok > 31> t:iter(1000000,t:t2bfun(),KostisString,t2b). > Reductions for one call: 4 > ---------------------------------------------------- > 2011-03-30 11:37:41.658 > 1000000 iterations of t2b took 0.362457s: 2758947.9579646691/s > > > ok > ----------------------- > - You get more correct measurements, showing a 2 to 3 speedup using > term_to_binary. ----------------------- using these tests, I get a similar result of around 2 speedup: 5> String2 = "qwertzuiopasdfghjklyxcvbnm" ++ [246,252,228,87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,41]. 
[113,119,101,114,116,122,117,105,111,112,97,115,100,102,103, 104,106,107,108,121,120,99,118,98,110,109,246,252,228|...] 6> t:iter(1000000,t:c2bfun(),String2,c2b). Reductions for one call: 8 ---------------------------------------------------- 2011-03-30 11:56:05.669 1000000 iterations of c2b took 0.701959s: 1424584.6267374591/s ok 7> 7> t:iter(1000000,t:t2bfun(),String2,c2b). Reductions for one call: 4 ---------------------------------------------------- 2011-03-30 11:56:14.630 1000000 iterations of c2b took 1.296981s: 771021.31796842061/s ok ----------------------- (I had to add a character larger than 255 manually as öäü are all below 256 (246, 228, 252) - at least on my platform) > The reasons are many: > 1) unicode:characters_to_binary is a well behaved bif consuming > reductions, which also means that it has to be more elaborate when > allocating, because it may be interrupted. This is more of a problem in > the ancient erlang:term_to_binary bif than one in the unicode bif. > 2) unicode:characters_to_binary does more elaborate range checking, it > only allows *valid* unicode characters, as described in the standard. > 3) unicode:characters_to_binary may need some optimization, but using > gprof, I find no really low hanging fruit. > They are both blazingly fast, so unless you plan to do huge amounts of md5 > calculations, my humble opinion is that you should use the one that suits > your problem. no, I'm perfectly fine with unicode:characters_to_binary (if speedup is only at 2) Nico
From philippu@REDACTED Wed Mar 30 12:26:54 2011 From: philippu@REDACTED (Philipp Unterbrunner) Date: Wed, 30 Mar 2011 12:26:54 +0200 Subject: [erlang-bugs] Re: [erlang-bugs 10] Re: Distributed node crashes silently when initially receiving a big chunk of messages from another node In-Reply-To: References: <4D652468.7000404@inf.ethz.ch> <4D906541.2060506@inf.ethz.ch> Message-ID: <4D93056E.2050608@inf.ethz.ch> I do not have a reasonably small demo yet, but I managed to get some coredumps of beam.smp. The nodes crash with a segfault at hipe_mode_switch.c, line 244 (of R14B02). This is code that is responsible for calling a native code closure. My application code does indeed send a few closures via messages, which are later called by the receiver node. I do not use hot code upgrades however, and the crashes are timing-related, as described before. I therefore suspect the crashes are the result of a race condition involving whatever code is responsible for making a received fun callable. Philipp On 03/29/2011 03:26 PM, pan@REDACTED wrote: > Hi! > > This sounds really bad! A demo application that reproduces the bug > would be really nice. > > Have you tried to enable core dumps to see if the erlang node crashes > with a segfault? I suppose there are no erl_crash.dump files left > after the crash that I can look at either? > > Any way to reproduce it would make it easier to find! > > Cheers, > /Patrik > > On Mon, 28 Mar 2011, Philipp Unterbrunner wrote: > >> The bug persists in R14B02. >> >> If I find time, I will make a small demo application so that others can >> reproduce the bug. >> >> Philipp >> >> On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote: >>> Hello, >>> >>> I have run into a serious and very annoying bug.
>>> >>> Affects (at least): R13B04, R14A, R14B, R14B01 >>> Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP) >>> >>> When a newly started distributed node receives a high number of >>> messages from another node, the newly started node crashes silently. >>> Nothing is printed to the console. No crash dump or core dump is >>> produced. >>> >>> In trying to find a work-around, I found the following curious >>> behavior: >>> >>> * The bug *only* occurs for distributed nodes (but regardless of >>> whether the nodes run on the same machine). >>> * Waiting a few seconds (or even longer) before sending the first >>> message to the newly started node does *not* make a difference. The >>> node will still crash when confronted with a large number of >>> incoming messages later. >>> * Speed matters. When doing a debug build, the bug appears less >>> often than when doing a release build, especially when HiPE is >>> enabled. However, I managed to cause the bug even in debug mode, and >>> when OTP was not compiled with native libs. The bug is simply much >>> less likely to be observed. >>> * The number of messages sent *initially* matters most. Slowly >>> "ramping up" the load is a work-around. Once a node is working at >>> high throughput, it is OK to stop sending messages for an arbitrary >>> period and at a later point send a big chunk of messages that would >>> have killed the node if sent initially. >>> * Timing matters. Running the receiver node with +T 7 or higher >>> makes the problem disappear. >>> * Setting the sender node's distribution buffer size to the minimum >>> (+zdbbl 1) makes the problem appear less often. >>> >>> I have reproduced the bug in various applications. The behavior >>> described above also makes it fairly obvious that the application is >>> not at fault. >>> >>> Rather, it appears that the receiver node is unable to buffer >>> incoming messages and crashes.
Of particular interest here is the >>> fact that "ramping up" the load is a work-around. I suspect a >>> low-level race condition where the receiver node does not allocate >>> sufficient buffer space in time and crashes. >>> >>> Given that the existing work-arounds are not desirable ("ramp up" >>> requires changes to the application code, +T 7 and +zdbbl 1 decrease >>> performance), and given that the bug now persists over multiple >>> releases, I hope someone can soon look into it. >>> >>> Thank you, >>> >>> Philipp >> From emile@REDACTED Wed Mar 30 13:35:12 2011 From: emile@REDACTED (Emile Joubert) Date: Wed, 30 Mar 2011 12:35:12 +0100 Subject: [erlang-bugs] Re: crypto from windows service? In-Reply-To: References: <4D91C68B.2070702@rabbitmq.com> <4D921171.3010707@rabbitmq.com> Message-ID: <4D931570.2050203@rabbitmq.com> On 30/03/11 11:00, Patrik Nyblom wrote: > Hi! > > You're absolutely right in that you should not use debugtype console for > production, I was just trying to narrow down the problem. The fact that > console works suggests that something gets messed up for > crypto/openssl when the stdout/stdin file descriptors get assigned to > pipes/files... > > A few quest(ion)s: > > Do you get an erl_crash.dump somewhere when it crashes? Do you get nothing > at all in the event viewer, or...? There is no erl_crash.dump in sight. The event viewer contains nothing at the time of the crash. Subsequent attempts to stop the service lead to entries some time after the crash: test: Using TerminateProcess to kill erlang. > The actual debug log, does that contain anything?
The debuglog contains nothing beyond the opening banner: Eshell V5.8.3 (abort with ^G) (test@REDACTED)1> > I've not used that particular OpenSSL version, could you try upgrading > to the latest OpenSSL (and also install the redistributables needed)? > Please also uninstall any other OpenSSL versions. WinXP and different > DLL versions when running as a service is known to cause problems I've been able to reproduce the bug using the "Win32 OpenSSL v1.0.0d Light" OpenSSL libraries instead of "Win32 OpenSSL v0.9.8r Light" on 2 separate XP SP3 32bit machines. The problem manifests in an identical manner. One of the machines has Visual Studio installed, for which no redistributable is necessary and the other had the redistributable installed. > unfortunately :( Also please install the OpenSSL DLLs in the Windows > directory when asked by the installer. This was done in all cases. --- I'd be interested to know what OS, Erlang and OpenSSL version you are using so that I can try that. I've not been able to reproduce the problem under Windows 7. So far it appears to be XP-specific. Regards Emile From sverker@REDACTED Wed Mar 30 14:59:14 2011 From: sverker@REDACTED (Sverker Eriksson) Date: Wed, 30 Mar 2011 14:59:14 +0200 Subject: [erlang-bugs] Re: [erlang-bugs 10] Re: Distributed node crashes silently when initially receiving a big chunk of messages from another node In-Reply-To: <4D93056E.2050608@inf.ethz.ch> References: <4D652468.7000404@inf.ethz.ch> <4D906541.2060506@inf.ethz.ch> <4D93056E.2050608@inf.ethz.ch> Message-ID: <4D932922.9000206@erix.ericsson.se> We have one known HiPE bug. I haven't merged it to dev yet, but you can get it from https://github.com/sverker/otp/commit/b715c077a88d5ba68e4e79b32c1c0de087234bbf It's a "minor" heap corruption related to binary matching. Could be worth trying even though we haven't confirmed it as the cause of any faults.
/Sverker, Erlang/OTP Philipp Unterbrunner wrote: > I do not have a reasonably small demo yet, but I managed to get some > coredumps of beam.smp. The nodes crash with a segfault at > hipe_mode_switch.c, line 244 (of R14B02). This is code that is > responsible for calling a native code closure. > > My application code does indeed send a few closures via messages, that > are later called by the receiver node. I do not use hot code upgrades > however, and the crashes are timing-related, as described before. I > therefore suspect the crashes are the result of a race condition > involving whatever code is responsible for making a received fun callable. > > Philipp > > > On 03/29/2011 03:26 PM, pan@REDACTED wrote: > >> Hi! >> >> This sounds really bad! A demo application that reproduces the bug >> would be really nice. >> >> Have you tried to enable core dumps to see if the erlang node crashes >> with a segfault? I suppose there are no erl_crash.dump files left >> after the crash that I can look at either? >> >> Any way to reproduce it would make it more easy to find! >> >> Cheers, >> /Patrik >> >> On Mon, 28 Mar 2011, Philipp Unterbrunner wrote: >> >> >>> The bug persists in r14b02. >>> >>> If I find time, I will make a small demo application so that others can >>> reproduce the bug. >>> >>> Philipp >>> >>> On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote: >>> >>>> Hello, >>>> >>>> I have run into a serious and very annoying bug. >>>> >>>> Affects (at least); R13B04, R14A, R14B, R14B01 >>>> Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP) >>>> >>>> When a newly started distributed node receives a high number of >>>> messages from another node, the newly started node crashes silently. >>>> Nothing is printed to the console. No crash dump or core dump is >>>> produced. 
>>>> >>>> In trying to find a work-around, I found the following curious >>>> behavior: >>>> >>>> * The bug *only* occurs for distributed nodes (but regardless of >>>> whether the nodes run on the same machine). >>>> * Waiting a few seconds (or even longer) before sending the first >>>> message to the newly started node does *not* make a difference. The >>>> node will still crash when confronted with a large number of >>>> incoming messages later. >>>> * Speed matters. When doing a debug build, the bug appears less >>>> often than when doing a release build, especially when HiPE is >>>> enabled. However, I managed to cause the bug even in debug mode, and >>>> when OTP was not compiled with native libs. The bug is simply much >>>> less likely to be observed. >>>> * The number of messages sent *initially* matters most. Slowly >>>> "ramping up" the load is a work-around. Once a node is working at >>>> high throughput, it is OK to stop sending messages for an arbitrary >>>> period and at a later point send a big chunk of messages that would >>>> have killed the node if sent initially. >>>> * Timing matters. Running the receiver node with +T 7 or higher >>>> makes the problem disappear. >>>> * Setting the sender node's distribution buffer size to the minimum >>>> (+zdbbl 1) makes the problem appear less often. >>>> >>>> I have reproduced the bug in various applications. The behavior >>>> described above also makes it fairly obvious that the application is >>>> not at fault. >>>> >>>> Rather, it appears that the receiver node is unable to buffer >>>> incoming messages and crashes. Of particular interest here is the >>>> fact that "ramping up" the load is a work-around. I suspect a >>>> low-level race condition where the receiver node does not allocate >>>> sufficient buffer space in time and crashes.
>>>> >>>> Given that the existing work-arounds are not desirable ("ramp up" >>>> requires changes to the application code, +T 7 and +zdbbl 1 decrease >>>> performance), and given that the bug now persists over multiple >>>> releases, I hope someone can soon look into it. >>>> >>>> Thank you, >>>> >>>> Philipp >>>> > From emile@REDACTED Wed Mar 30 16:44:35 2011 From: emile@REDACTED (Emile Joubert) Date: Wed, 30 Mar 2011 15:44:35 +0100 Subject: [erlang-bugs] Re: crypto from windows service? In-Reply-To: References: <4D91C68B.2070702@rabbitmq.com> <4D921171.3010707@rabbitmq.com> <4D931570.2050203@rabbitmq.com> Message-ID: <4D9341D3.4000503@rabbitmq.com> Hi Patrik, On 30/03/11 15:04, Patrik Nyblom wrote: > Hi! > > I'm using Windows XP SP3 as well, but using OpenSSL 0.9.7e. If I upgrade > to the latest on WinXP, I get the same symptom as you (Yes!). > > Could you verify that your problems disappear if you use 0.9.7e? I've > attached the installer to this mail. Yes, I can confirm the problems disappear with 0.9.7e. Using the older version of OpenSSL is a reasonable workaround until a better solution is found - thanks for that. Regards Emile From eric.pailleau@REDACTED Thu Mar 31 21:53:59 2011 From: eric.pailleau@REDACTED (PAILLEAU Eric) Date: Thu, 31 Mar 2011 21:53:59 +0200 Subject: [erlang-bugs] Missing init:get_args() function ? Message-ID: <4D94DBD7.2000306@wanadoo.fr> $> erl -sname titi -config toto Erlang R14B02 (erts-5.8.3) [source] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false] Eshell V5.8.3 (abort with ^G) 1> init:get_args(). ** exception error: undefined function init:get_args/0 while 2> init:get_plain_arguments(). [] Looks like init:get_args() does not exist, although it is still in the latest documentation.
Did I miss something in the docs, or is it a bug? Regards.
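For anyone who hits the same undefined-function error, here is a minimal sketch of reading the command line through init functions that are exported: init:get_plain_arguments/0 (which works in the session above) and init:get_argument/1. The module name args_demo and the helpers plain_args/0 and flag_values/1 are made up for illustration and are not part of OTP.

```erlang
%% Hypothetical helper module, for illustration only (not part of OTP).
-module(args_demo).
-export([plain_args/0, flag_values/1]).

%% Arguments that are not attached to any flag, e.g. those after -extra.
plain_args() ->
    init:get_plain_arguments().

%% All values given to a user flag such as -config.
%% init:get_argument/1 returns {ok, [Values]} with one inner list per
%% occurrence of the flag, or the atom 'error' when the flag is absent.
flag_values(Flag) when is_atom(Flag) ->
    case init:get_argument(Flag) of
        {ok, Occurrences} -> lists:append(Occurrences);
        error -> []
    end.
```

With a node started as in the report (erl -sname titi -config toto), args_demo:flag_values(config) should give the values passed to -config, and args_demo:plain_args() should give the same empty list that init:get_plain_arguments() printed there.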