From kwidoyo@REDACTED  Tue Mar  1 08:24:28 2011
From: kwidoyo@REDACTED (Kustarto Widoyo)
Date: Tue, 01 Mar 2011 16:24:28 +0900
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To: <4D4BA635.8050001@geminimobile.com>
References: <4D4BA2F3.2080405@geminimobile.com> <4D4BA635.8050001@geminimobile.com>
Message-ID: <4D6C9F2C.6020606@geminimobile.com>


Could someone help to take a look at this issue?

Regards,
Widoyo

From pan@REDACTED  Tue Mar  1 11:54:28 2011
From: pan@REDACTED (pan@REDACTED)
Date: Tue, 1 Mar 2011 11:54:28 +0100
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To: <4D4BA2F3.2080405@geminimobile.com>
References: <4D4BA2F3.2080405@geminimobile.com>
Message-ID: <Pine.LNX.4.64.1103011146180.10609@arwen.otp.ericsson.se>

Not much to go on here, could be anything.

Could you supply the rest of the stack, i.e. who calls the print_term 
function?

Also, by using the etp-commands gdb macros (in source tree, 
$ERL_TOP/erts/etc/unix/), you could see the term that's being printed.
Is it corrupted?

I would also try increasing the scheduler stack size (erl +sss Value), as
it could be a stack overrun (the parameters to the functions seem ok, but
that's a wild guess).

/Patrik

On Fri, 4 Feb 2011, Kustarto Widoyo wrote:

> Hi All,
>
> Our application crashed and generated a core dump file, but no
> erl_crash.dump file was created.
>
> * The application uses the Erlang distribution protocol across more than
> 50 nodes. All nodes run on Redhat 5.3.
>
> * We're using Erlang R13B04 64 bit built from source and the following 
> patches have been applied:
> 	otp_src_R13B04-OTP-8475.patch
> 	otp_src_R13B04-OTP-8612.patch
> 	otp_src_R13B04-OTP-8643.patch
> 	otp_src_R13B04-OTP-8658.patch
> 	otp_src_R13B04-OTP-8661.patch
> 	otp_src_R13B04-OTP-8662.patch
> 	otp_src_R13B04-beam-break.patch
> 	otp_src_R13B04-emacs.patch
> 	otp_src_R13B04-erl_poll.patch
> 	otp_src_R13B04-erts_de_busy_limit.patch
> 	otp_src_R13B04-eunit.patch
> 	otp_src_R13B04-httpc-memoryleak.patch
> 	otp_src_R13B04-patch-etop.patch
> 	otp_src_R13B04-patch-odbc-oracleworkaround.patch
> 	otp_src_R13B04-supervisor.patch
>
> * The erl command line we use:
> $ /usr/local/gemini/ert/R13B04/lib/erlang/erts-5.7.5/bin/beam.smp -A 64 -K 
> true -S 0 -- -root /usr/local/gemini/ert/R13B04/lib/erlang -progname erl -- 
> -home /export/home/mmssys -- -smp enable -noshell -noinput -noshell -sname 
> gdss1 -kernel net_ticktime 20 -boot 
> /usr/local/gemini/gdss/1.0.0/lib/app/gdss_all -config 
> /usr/local/gemini/gdss/1.0.0/../var/data/node1.config -pa 
> /usr/local/gemini/gdss/1.0.0/lib/app-patches -pz 
> /usr/local/gemini/gdss/1.0.0/lib/app -pz /usr/local/gemini/gdss/1.0.0/lib 
> -central_config /usr/local/gemini/gdss/1.0.0/etc/central.conf 
> -ticket_broker_config /usr/local/gemini/gdss/1.0.0/etc/broker.conf
>
>
> * The following is gdb and backtrace output:
> ------------------------
> [root@REDACTED data]# gdb 
> /usr/local/gemini/ert/R13B04/lib/erlang/erts-5.7.5/bin/beam.smp core.14720
> GNU gdb Fedora (6.8-27.el5)
> Copyright (C) 2008 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu"...
> Reading symbols from /lib64/libutil.so.1...done.
> Loaded symbols for /lib64/libutil.so.1
> Reading symbols from /lib64/libdl.so.2...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /lib64/libm.so.6...done.
> Loaded symbols for /lib64/libm.so.6
> Reading symbols from /usr/lib64/libncurses.so.5...done.
> Loaded symbols for /usr/lib64/libncurses.so.5
> Reading symbols from /lib64/libpthread.so.0...done.
> Loaded symbols for /lib64/libpthread.so.0
> Reading symbols from /lib64/librt.so.1...done.
> Loaded symbols for /lib64/librt.so.1
> Reading symbols from /lib64/libc.so.6...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from 
> /usr/local/gemini/ert/R13B04/lib/erlang/lib/crypto-1.6.4/priv/lib/crypto_drv.so...done.
> Loaded symbols for 
> /usr/local/gemini/ert/R13B04/lib/erlang/lib/crypto-1.6.4/priv/lib/crypto_drv.so
> Reading symbols from 
> /usr/local/gemini/ert/R13B04/openssl/lib/libcrypto.so.0.9.8...done.
> Loaded symbols for 
> /usr/local/gemini/ert/R13B04/openssl/lib/libcrypto.so.0.9.8
> Core was generated by 
> `/usr/local/gemini/ert/R13B04/lib/erlang/erts-5.7.5/bin/beam.smp -A 64 -K 
> true -'.
> Program terminated with signal 11, Segmentation fault.
> [New process 14798]
>
> snip ... snip ... snip ...
>
> #0  0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 <write_ds>, 
> arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840
> 840     common/erl_printf_format.c: No such file or directory.
>        in common/erl_printf_format.c
> (gdb) bt
> #0  0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 <write_ds>, 
> arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840
> #1  0x000000000048c6d8 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930, 
> obj=5800176, dcount=0x458f2bc8)
>    at beam/erl_printf_term.c:346
> #2  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930, 
> obj=<value optimized out>, dcount=0x458f2bc8)
>    at beam/erl_printf_term.c:349
> #3  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930, 
> obj=<value optimized out>, dcount=0x458f2bc8)
>    at beam/erl_printf_term.c:349
>
> snip ... snip ... snip ...
> ------------------------
>
> * And in syslog, we found:
> Jan 27 23:48:21 gds001c kernel: beam.smp[14798]: segfault at 0000000044ef3ff8 
> rip 0000000000585ea8 rsp 0000000044ef4000 error 6
>
> Please let me know, if there is anything else we have to provide.
>
> Regards,
> -- 
> Kustarto Widoyo
> Gemini Mobile Technologies
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>

From dave@REDACTED  Fri Mar  4 11:53:58 2011
From: dave@REDACTED (Dave Cottlehuber)
Date: Fri, 4 Mar 2011 23:53:58 +1300
Subject: erl.exe dies but werl.exe does not on both Windows XP and 2008R2 with R14B01
Message-ID: <AANLkTi=GyiQQ3eJ9ze9iOKEFYc=HnEcGQqSmFYsVb4SM@mail.gmail.com>

Hallo,

There are 2 issues I've identified - VM crash & VM hang. Both occur
within a CouchDB build of erlang, on various windows variants. This
email covers the crash only. It is easy to reproduce:

Install CouchDB 1.0.1 or a more recent build from
https://github.com/dch/couchdb/downloads and curl.exe from
http://haxx.se/
open command prompt and change to couchdb/bin folder.
set erl=erl
couchdb.bat

Then run this script until Erlang dies (watch the erl console scroll
by!). On my 2 testbeds this takes less than a minute to occur - just
25 curls.

::restart_couch.cmd
@echo off
:restart
for /l %%i in (1,1,100000000000) do @call :curl %%i
goto :eof
:curl
curl -v -H "Content-Type: application/json" -X POST http://localhost:5984/_restart
:: check to see if couch died horribly
if exist erl_crash.dump echo Woops!!!! && move /y erl_crash.dump erl_crash.dump.%1
goto :eof


=erl_crash_dump:0.1
Fri Mar 04 23:48:06 2011
Slogan: Kernel pid terminated (application_controller)
({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})
System version: Erlang R14B01 (erts-5.8.2) [source] [smp:2:2] [rq:2]
[async-threads:4]
Compiled: Sat Feb 12 23:25:21 2011

The full .dump can be found at http://friendpaste.com/1gMbN0i2zn3mlaHAC54D58

1>   [replicator] max_http_sessions="20"
1>   [replicator] ssl_certificate_max_depth="3"
1>   [replicator] verify_ssl_certificates="false"
1>   [stats] rate="1000"
1>   [stats] samples="[0, 60, 300, 900]"
1>   [uuids] algorithm="sequential"
1> Apache CouchDB has started. Time to relax.
1> [info] [<0.689.0>] Apache CouchDB has started on http://0.0.0.0:5984/
1> [debug] [<0.761.0>] 'POST' /_restart {1,1}
Headers: [{'Accept',"*/*"},
          {'Content-Type',"application/json"},
          {'Host',"localhost:5984"},
          {'User-Agent',"curl/7.19.0 (i586-pc-mingw32msvc)
libcurl/7.19.0 OpenSSL/1.0.0c zlib/1.2.3"
}]
1> [debug] [<0.761.0>] OAuth Params: []
1> [info] [<0.761.0>] 127.0.0.1 - - 'POST' /_restart 200
1> {error_logger,{{2011,3,4},{23,48,5}},crash_report,[[{initial_call,{supervisor_bridge,user_sup,['Argument__1']}},{pid,<0.785.0>},{registered_name,[]},{error_info,{exit,nouser,[{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[kernel_sup,<0.773.0>]},{messages,[]},{links,[<0.774.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,24},{reductions,338}],[]]}
{error_logger,{{2011,3,4},{23,48,5}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,nouser},{offender,[{pid,undefined},{name,user},{mfargs,{user_sup,start,[]}},{restart_type,temporary},{shutdown,2000},{child_type,supervisor}]}]}
{error_logger,{{2011,3,4},{23,48,5}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}

Crash dump was written to: erl_crash.dump
Kernel pid terminated (application_controller)
({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})


Can anybody help clarify why this happens, and what we can do about it?

Thanks
Dave

From dave@REDACTED  Fri Mar  4 11:57:27 2011
From: dave@REDACTED (Dave Cottlehuber)
Date: Fri, 4 Mar 2011 23:57:27 +1300
Subject: erl.exe dies but werl.exe does not on both Windows XP and 2008R2
 with R14B01
In-Reply-To: <AANLkTi=GyiQQ3eJ9ze9iOKEFYc=HnEcGQqSmFYsVb4SM@mail.gmail.com>
References: <AANLkTi=GyiQQ3eJ9ze9iOKEFYc=HnEcGQqSmFYsVb4SM@mail.gmail.com>
Message-ID: <AANLkTimeCS--UjxRZaL2aTYNrgz-hsGWYWV4OKdDgtmG@mail.gmail.com>

On 4 March 2011 23:53, Dave Cottlehuber <dave@REDACTED> wrote:
> Hallo,
>
> There are 2 issues I've identified - VM crash & VM hang. Both occur
> within a CouchDB build of erlang, on various windows variants. This
> email covers the crash only. It is easy to reproduce:
>
> Install CouchDB 1.0.1 or a more recent build from
> https://github.com/dch/couchdb/downloads and curl.exe from
> http://haxx.se/
> open command prompt and change to couchdb/bin folder.
> set erl=erl
> couchdb.bat
>
> & run this script until erlang dies (watch the erl console scroll
> by!). On my 2 testbeds this takes less than a minute to occur - just
> 25 curls.

Sorry; key point is that running with werl.exe instead will run
successfully for days.
More info on the original issue is available at
https://issues.apache.org/jira/browse/COUCHDB-963.

Thanks again.
Dave

From dave@REDACTED  Fri Mar  4 12:11:35 2011
From: dave@REDACTED (Dave Cottlehuber)
Date: Sat, 5 Mar 2011 00:11:35 +1300
Subject: erl.exe dies but werl.exe does not on both Windows XP and 2008R2
 with R14B01
In-Reply-To: <AANLkTi=GyiQQ3eJ9ze9iOKEFYc=HnEcGQqSmFYsVb4SM@mail.gmail.com>
References: <AANLkTi=GyiQQ3eJ9ze9iOKEFYc=HnEcGQqSmFYsVb4SM@mail.gmail.com>
Message-ID: <AANLkTinx66Qng0SGvjHgJzfzJ0JZ5EfNMZC1mNKe4gGM@mail.gmail.com>

On 4 March 2011 23:53, Dave Cottlehuber <dave@REDACTED> wrote:
> Hallo,
>
> There are 2 issues I've identified - VM crash & VM hang. Both occur
> within a CouchDB build of erlang, on various windows variants. This
> email covers the crash only. It is easy to reproduce:

The 2nd issue is the VM hang, which occurs when running under an erlsrv
service account rather than from the batch file. I am using the following
erlsrv configurations; the debug one provides a visible console, so it is
easier to work with, but the same issue occurs with either configuration.

debug: erlsrv.exe add "CouchDeBug" -workdir
"c:\couch\couchdb-1.0.2\bin" -onfail restart_always -debugtype console
-args "-sasl errlog_type error -s couch +A 4 +W w" -comment
"CouchDeBug" -machine "c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe"

new: erlsrv.exe add "NewCouch" -workdir "c:\couch\couchdb-1.0.2\bin"
-onfail restart_always -args "-sasl errlog_type error -s couch +A 4 +W
w" -comment "NewCouch" -machine
"c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe"

erlsrv starts erl.exe not werl.exe and so the issue noted in previous
email crops up and the application is down.

2 questions -

Why does erlsrv not use werl.exe? It is technically possible to pass
erlsrv the -machine parameter with werl.exe, and it runs successfully
as a service, avoiding the original issue. The Erlang/OTP source
confirms that this is bad - but why?

2nd part: using erlsrv "-onfail restart_always", CouchDB restarts very
quickly, but after up to 8h of continuous curl _restart the erl.exe
window definitely hangs - no input accepted - as if the REPL loop is
over.

Any ideas why erlang seems to hang around init:restart() and what we
can do about it? Do you want any more information?
Thanks
Dave

From magnus.henoch@REDACTED  Fri Mar  4 15:39:15 2011
From: magnus.henoch@REDACTED (Magnus Henoch)
Date: Fri, 4 Mar 2011 14:39:15 +0000 (GMT)
Subject: Can't run mnesia:first on empty fragmented table
In-Reply-To: <2056185093.24901299249121046.JavaMail.root@zimbra>
Message-ID: <642444943.25071299249555440.JavaMail.root@zimbra>

Hi all,

When I run mnesia:first on an empty fragmented table, it tries to
access the fragment with the number one beyond the maximum.  In the
sample code below, I create a table with two fragments, 'foo' and
'foo_frag2', but mnesia tries to access 'foo_frag3':

-module(foo).

-compile(export_all).

foo() ->
    net_kernel:start([foo, shortnames]),
    application:start(mnesia),
    {atomic, ok} = mnesia:create_table(foo, []),
    %% activate fragmentation
    {atomic, ok} = mnesia:change_table_frag(foo, {activate, []}),
    %% add a second fragment on this node
    {atomic, ok} = mnesia:change_table_frag(foo, {add_frag, [node()]}),
    
    io:format("Our table is fragmented:~n~p~n", [mnesia:table_info(foo, all)]),
    
    io:format("Now let's run mnesia:first.  We expect to get ~p.~n~p~n",
              ['$end_of_table',
               %% but we get {'EXIT',{aborted,{no_exists,[foo_frag3]}}}
               catch mnesia:activity(sync_dirty,
                                 fun() -> mnesia:first(foo) end,
                                 [],
                                 mnesia_frag)]).

It looks like a simple off-by-one error in mnesia_frag:search_first.
Changing the guard from '=<' to '<' as in the patch below fixes my
test case (and the real system I distilled it from), but I'd
appreciate a second opinion.

Regards,
Magnus


diff --git a/lib/mnesia/src/mnesia_frag.erl b/lib/mnesia/src/mnesia_frag.erl
index a2958ab..d33dafe 100644
--- a/lib/mnesia/src/mnesia_frag.erl
+++ b/lib/mnesia/src/mnesia_frag.erl
@@ -209,7 +209,7 @@ first(ActivityId, Opaque, Tab) ->
 	    end
     end.
 
-search_first(ActivityId, Opaque, Tab, N, FH) when N =< FH#frag_state.n_fragments ->
+search_first(ActivityId, Opaque, Tab, N, FH) when N < FH#frag_state.n_fragments ->
     NextN = N + 1,
     NextFrag = n_to_frag_name(Tab, NextN),
     case mnesia:first(ActivityId, Opaque, NextFrag) of
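
To make the off-by-one concrete, here is a minimal standalone sketch. It
does not call mnesia at all; the module name frag_guard_demo and the
function probed/2 are made up for illustration. It only mimics the shape
of the search_first recursion, assuming every fragment is empty and that
fragment 1 has already been probed by first/3:

```erlang
-module(frag_guard_demo).
-export([probed/2]).

%% probed(Guard, NFragments) returns the fragment numbers that a
%% search_first-style loop would probe after fragment 1 came back
%% empty, assuming all fragments are empty.
%% Guard is 'old' (N =< NFragments) or 'fixed' (N < NFragments).
probed(Guard, NFragments) ->
    search(Guard, 1, NFragments, []).

%% Old guard: still true when N equals the last fragment number,
%% so fragment N + 1 (which does not exist) gets probed.
search(old, N, NFragments, Acc) when N =< NFragments ->
    search(old, N + 1, NFragments, [N + 1 | Acc]);
%% Fixed guard: stops once N has reached the last fragment.
search(fixed, N, NFragments, Acc) when N < NFragments ->
    search(fixed, N + 1, NFragments, [N + 1 | Acc]);
search(_Guard, _N, _NFragments, Acc) ->
    lists:reverse(Acc).
```

With two fragments, probed(old, 2) includes fragment 3, which matches the
no_exists error for foo_frag3 in the test case, while probed(fixed, 2)
stops at fragment 2.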

From pan@REDACTED  Fri Mar  4 16:41:03 2011
From: pan@REDACTED (pan@REDACTED)
Date: Fri, 4 Mar 2011 16:41:03 +0100
Subject: [erlang-bugs] Re: erl.exe dies but werl.exe does not on both
 Windows XP and 2008R2 with R14B01
In-Reply-To: <AANLkTinx66Qng0SGvjHgJzfzJ0JZ5EfNMZC1mNKe4gGM@mail.gmail.com>
References: <AANLkTi=GyiQQ3eJ9ze9iOKEFYc=HnEcGQqSmFYsVb4SM@mail.gmail.com>
 <AANLkTinx66Qng0SGvjHgJzfzJ0JZ5EfNMZC1mNKe4gGM@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.1103041626340.10609@arwen.otp.ericsson.se>

Hi,

On Sat, 5 Mar 2011, Dave Cottlehuber wrote:

> On 4 March 2011 23:53, Dave Cottlehuber <dave@REDACTED> wrote:
>> Hallo,
>>
>> There are 2 issues I've identified - VM crash & VM hang. Both occur
>> within a CouchDB build of erlang, on various windows variants. This
>> email covers the crash only. It is easy to reproduce:
>
> The 2nd issue is related to VM hang, under erlsrv service account &
> not batch file. I am using following erlsrv configurations; the debug
> one provides a visible console so easier to work with but same issue
> occurs with either service configuration.
>
> debug: erlsrv.exe add "CouchDeBug" -workdir
> "c:\couch\couchdb-1.0.2\bin" -onfail restart_always -debugtype console
> -args "-sasl errlog_type error -s couch +A 4 +W w" -comment
> "CouchDeBug" -machine "c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe"
>
> new: erlsrv.exe add "NewCouch" -workdir "c:\couch\couchdb-1.0.2\bin"
> -onfail restart_always -args "-sasl errlog_type error -s couch +A 4 +W
> w" -comment "NewCouch" -machine
> "c:\couch\CouchDB-1.0.2\erts-5.8.2\bin\erl.exe"
>
> erlsrv starts erl.exe not werl.exe and so the issue noted in previous
> email crops up and the application is down.

I think it may be the same issue. We're investigating the batch file issue 
to start with. The problem is easy to reproduce - very nice.

>
> 2 questions -
>
> Why does erlsrv not use werl.exe ? it is technically possible to pass
> erlsrv the -machine parameter with werl.exe, and it runs successfully
> as a service avoiding the original issue. The erlang/OTP source
> confirms that this is bad - but why?

Werl cannot handle everything the erlsrv program wants to do to the 
machine, like stopactions, killing by signalling etc. Fixing erl so it 
does not hang is the easiest and best thing to do here.

>
> 2nd part. using erlsrv "-restart_always" couchdb restarts very
> quickly. but after up to 8h of continuous curl _restart, the erl.exe
> window definitely hangs - no input accepted - as if the REPL loop is
> over.
>
> Any ideas why erlang seems to hang around init:restart() and what we
> can do about it? Do you want any more information?

When running erl on Windows you get the "old shell", meaning that another 
io-server is running and also a special driver (the fd-driver) is used. 
The fd-driver is emulating Unix behaviour and might be the cause of all
the problems, but the actual user.erl code might also be broken. I'll 
debug it, find the first problem and get back to you when I've narrowed it 
down!


> Thanks
> Dave

Cheers,
/Patrik


From dangud@REDACTED  Mon Mar  7 16:07:49 2011
From: dangud@REDACTED (Dan Gudmundsson)
Date: Mon, 7 Mar 2011 16:07:49 +0100
Subject: [erlang-bugs] Can't run mnesia:first on empty fragmented table
In-Reply-To: <642444943.25071299249555440.JavaMail.root@zimbra>
References: <2056185093.24901299249121046.JavaMail.root@zimbra>
	<642444943.25071299249555440.JavaMail.root@zimbra>
Message-ID: <AANLkTim2aGEs-KVBf3g3_oWTopogB4jKwKsaU9cfbgrJ@mail.gmail.com>

Looks correct to me.

I will include it directly.

/Dan

On Fri, Mar 4, 2011 at 3:39 PM, Magnus Henoch
<magnus.henoch@REDACTED> wrote:
> Hi all,
>
> When I run mnesia:first on an empty fragmented table, it tries to
> access the fragment with the number one beyond the maximum.  In the
> sample code below, I create a table with two fragments, 'foo' and
> 'foo_frag2', but mnesia tries to access 'foo_frag3':
>
> -module(foo).
>
> -compile(export_all).
>
> foo() ->
>     net_kernel:start([foo, shortnames]),
>     application:start(mnesia),
>     {atomic, ok} = mnesia:create_table(foo, []),
>     %% activate fragmentation
>     {atomic, ok} = mnesia:change_table_frag(foo, {activate, []}),
>     %% add a second fragment on this node
>     {atomic, ok} = mnesia:change_table_frag(foo, {add_frag, [node()]}),
>
>     io:format("Our table is fragmented:~n~p~n", [mnesia:table_info(foo, all)]),
>
>     io:format("Now let's run mnesia:first.  We expect to get ~p.~n~p~n",
>               ['$end_of_table',
>                %% but we get {'EXIT',{aborted,{no_exists,[foo_frag3]}}}
>                catch mnesia:activity(sync_dirty,
>                                  fun() -> mnesia:first(foo) end,
>                                  [],
>                                  mnesia_frag)]).
>
> It looks like a simple off-by-one error in mnesia_frag:search_first.
> Changing the guard from '=<' to '<' as in the patch below fixes my
> test case (and the real system I distilled it from), but I'd
> appreciate a second opinion.
>
> Regards,
> Magnus
>
>
> diff --git a/lib/mnesia/src/mnesia_frag.erl b/lib/mnesia/src/mnesia_frag.erl
> index a2958ab..d33dafe 100644
> --- a/lib/mnesia/src/mnesia_frag.erl
> +++ b/lib/mnesia/src/mnesia_frag.erl
> @@ -209,7 +209,7 @@ first(ActivityId, Opaque, Tab) ->
>             end
>      end.
>
> -search_first(ActivityId, Opaque, Tab, N, FH) when N =< FH#frag_state.n_fragments ->
> +search_first(ActivityId, Opaque, Tab, N, FH) when N < FH#frag_state.n_fragments ->
>      NextN = N + 1,
>      NextFrag = n_to_frag_name(Tab, NextN),
>      case mnesia:first(ActivityId, Opaque, NextFrag) of
>

From kwidoyo@REDACTED  Tue Mar  8 09:13:56 2011
From: kwidoyo@REDACTED (Kustarto Widoyo)
Date: Tue, 08 Mar 2011 17:13:56 +0900
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To: <Pine.LNX.4.64.1103011146180.10609@arwen.otp.ericsson.se>
References: <4D4BA2F3.2080405@geminimobile.com> <Pine.LNX.4.64.1103011146180.10609@arwen.otp.ericsson.se>
Message-ID: <4D75E544.6080209@geminimobile.com>

 > Could you supply the rest of the stack, i.e. who calls the print_term
 > function?

The following is the rest of the stack (result of thread apply all bt).
...
Thread 1 (process 14798):
#0  0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 <write_ds>,
     arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840
#1  0x000000000048c6d8 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
     obj=5800176, dcount=0x458f2bc8) at beam/erl_printf_term.c:346
#2  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
     obj=<value optimized out>, dcount=0x458f2bc8) at beam/erl_printf_term.c:349
#3  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
     obj=<value optimized out>, dcount=0x458f2bc8) at beam/erl_printf_term.c:349
#4  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
     obj=<value optimized out>, dcount=0x458f2bc8) at beam/erl_printf_term.c:349

....snip....

#43669 0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>,
     arg=0xb750930, obj=<value optimized out>, dcount=0x458f2bc8)
     at beam/erl_printf_term.c:349
#43670 0x000000000048cf48 in erts_printf_term (fn=0xb750930, arg=0x44ef4004,
     term=1, precision=56331) at beam/erl_printf_term.c:452
#43671 0x0000000000586b9c in erts_printf_format (fn=0x5880f0 <write_ds>,
     arg=0xb750930, fmt=<value optimized out>, ap=0x458f2cd0)
     at common/erl_printf_format.c:813
#43672 0x0000000000587df4 in erts_vdsprintf (dsbufp=0xb750930,
     format=0x591897 "   %T\n", arglist=0x458f2cd0) at common/erl_printf.c:419
#43673 0x0000000000470fe0 in erts_print (to=<value optimized out>, arg=0x1,
     format=0x1c2510 <Address 0x1c2510 out of bounds>) at beam/utils.c:299
#43674 0x00000000004921a3 in erts_program_counter_info (to=-4,
     to_arg=0xb750930, p=0x2aaabb0ae630) at beam/erl_process.c:8387
#43675 0x00000000004921f5 in erts_stack_dump (to=-4, to_arg=0xb750930, p=0x1)
     at beam/erl_process.c:8360
#43676 0x00000000004e9191 in print_process_info (to=-4, to_arg=0xb750930,
     p=0x2aaabb0ae630) at beam/break.c:343
#43677 0x00000000004e9384 in process_info (to=-4, to_arg=0xb750930)
     at beam/break.c:79
#43678 0x0000000000458744 in system_info_1 (A__p=0x2aaabb0b6ca0, A_1=24459)
     at beam/erl_bif_info.c:1973
#43679 0x000000000051279a in process_main () at beam/beam_emu.c:2087
#43680 0x000000000049fac2 in sched_thread_func (vesdp=<value optimized out>)
     at beam/erl_process.c:3060
#43681 0x0000000000585314 in thr_wrapper (vtwd=<value optimized out>)
     at common/ethread.c:480
#43682 0x0000003383006367 in start_thread () from /lib64/libpthread.so.0
#43683 0x00000033824d309d in clone () from /lib64/libc.so.6

 > Also, by using the etp-commands gdb macros (in source tree,
 > $ERL_TOP/erts/etc/unix/), you could see the term that's being printed.
 > Is it corrupted?

Sorry, I am still not able to use it.

Thanks,
Widoyo


From pan@REDACTED  Tue Mar  8 14:01:32 2011
From: pan@REDACTED (pan@REDACTED)
Date: Tue, 8 Mar 2011 14:01:32 +0100
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To: <4D75E41C.1080502@geminimobile.com>
References: <4D4BA2F3.2080405@geminimobile.com>
 <Pine.LNX.4.64.1103011146180.10609@arwen.otp.ericsson.se>
 <4D75E41C.1080502@geminimobile.com>
Message-ID: <Pine.LNX.4.64.1103081357430.10609@arwen.otp.ericsson.se>

Hi!

Interesting, it seems that the system_info BIF barfs... It would be
interesting to see a printout of the parameters to system_info_1; the
first parameter should be a pointer to a process structure and the second
an Erlang term (you print it with etp A_1).

Do you have any possibility to provide me with the core and your build of 
the VM, together with the source?

Cheers,
/Patrik

On Tue, 8 Mar 2011, Kustarto Widoyo wrote:

>
>> Could you supply the rest of the stack, i.e. who calls the print_term
>> function?
>
> Please find the attached file. It's about the rest of the stack.
>
> Thread 1 (process 14798):
> #0  0x0000000000585ea8 in erts_printf_char (fn=0x5880f0 <write_ds>,
>    arg=0xb750930, c=91 '[') at common/erl_printf_format.c:840
> #1  0x000000000048c6d8 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
>    obj=5800176, dcount=0x458f2bc8) at beam/erl_printf_term.c:346
> #2  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
>    obj=<value optimized out>, dcount=0x458f2bc8) at beam/erl_printf_term.c:349
> #3  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
>    obj=<value optimized out>, dcount=0x458f2bc8) at beam/erl_printf_term.c:349
> #4  0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>, arg=0xb750930,
>    obj=<value optimized out>, dcount=0x458f2bc8) at beam/erl_printf_term.c:349
>
> ....snip....
>
> #43669 0x000000000048c740 in print_term (fn=0x5880f0 <write_ds>,
>    arg=0xb750930, obj=<value optimized out>, dcount=0x458f2bc8)
>    at beam/erl_printf_term.c:349
> #43670 0x000000000048cf48 in erts_printf_term (fn=0xb750930, arg=0x44ef4004,
>    term=1, precision=56331) at beam/erl_printf_term.c:452
> #43671 0x0000000000586b9c in erts_printf_format (fn=0x5880f0 <write_ds>,
>    arg=0xb750930, fmt=<value optimized out>, ap=0x458f2cd0)
>    at common/erl_printf_format.c:813
> #43672 0x0000000000587df4 in erts_vdsprintf (dsbufp=0xb750930,
>    format=0x591897 "   %T\n", arglist=0x458f2cd0) at common/erl_printf.c:419
> #43673 0x0000000000470fe0 in erts_print (to=<value optimized out>, arg=0x1,
>    format=0x1c2510 <Address 0x1c2510 out of bounds>) at beam/utils.c:299
> #43674 0x00000000004921a3 in erts_program_counter_info (to=-4,
>    to_arg=0xb750930, p=0x2aaabb0ae630) at beam/erl_process.c:8387
> #43675 0x00000000004921f5 in erts_stack_dump (to=-4, to_arg=0xb750930, p=0x1)
>    at beam/erl_process.c:8360
> #43676 0x00000000004e9191 in print_process_info (to=-4, to_arg=0xb750930,
>    p=0x2aaabb0ae630) at beam/break.c:343
> #43677 0x00000000004e9384 in process_info (to=-4, to_arg=0xb750930)
>    at beam/break.c:79
> #43678 0x0000000000458744 in system_info_1 (A__p=0x2aaabb0b6ca0, A_1=24459)
>    at beam/erl_bif_info.c:1973
> #43679 0x000000000051279a in process_main () at beam/beam_emu.c:2087
> #43680 0x000000000049fac2 in sched_thread_func (vesdp=<value optimized out>)
>    at beam/erl_process.c:3060
> #43681 0x0000000000585314 in thr_wrapper (vtwd=<value optimized out>)
>    at common/ethread.c:480
> #43682 0x0000003383006367 in start_thread () from /lib64/libpthread.so.0
> #43683 0x00000033824d309d in clone () from /lib64/libc.so.6
>
>> Also, by using the etp-commands gdb macros (in source tree,
>> $ERL_TOP/erts/etc/unix/), you could see the term that's being printed.
>> Is it corrupted?
>
> Sorry, I am still not able to use it.
>
> Thanks,
> Widoyo
>
>
>

From gopienko@REDACTED  Thu Mar 10 08:39:49 2011
From: gopienko@REDACTED (Andrew Gopienko)
Date: Thu, 10 Mar 2011 13:39:49 +0600
Subject: Relup instructions order
Message-ID: <AANLkTi=cjiw+=OMGjhv5KvKVjCTPrqfJBjBF8A6joZTa@mail.gmail.com>

Erlang R14B01, Ubuntu 10.10 x86 box

gproc.appup file
----------------------------------------------------
%% appup generated for gproc by rebar ("2011/03/10 13:15:09")
{"0.1.1",
   [{"0.01", [{delete_module,gproc_eqc},{update,gproc,{advanced,[]}}]}],
   [{"0.01", []}]
}.

relup file
-------------------------------------------------------
{"0.5.5",
 [{"0.5.4",[],
   [{load_object_code,{gproc,"0.1.1",[gproc]}},
    point_of_no_return,
    {remove,{gproc_eqc,brutal_purge,brutal_purge}},
    {purge,[gproc_eqc]},
    {suspend,[gproc]},
    {load,{gproc,brutal_purge,brutal_purge}},
    {code_change,up,[{gproc,[]}]},
    {resume,[gproc]}]}],
 [{"0.5.4",[],[point_of_no_return]}]}.

After the 'point_of_no_return' instruction is evaluated, the library path
is updated to the new location, and the 'remove' instruction then crashes
in the call
release_handler_1:get_vsn(non_existing=code:which(gproc_eqc)).

release_handler:upgrade_app also crashes with the same Reason.

(flm@REDACTED)1> code:which(gproc_eqc).
"/home/tdx/devel/fleetm/rel/flm_0.5.4/lib/gproc-0.01/ebin/gproc_eqc.beam"

(flm@REDACTED)2> release_handler:upgrade_app(gproc, "../flm/lib/gproc-0.1.1").
{'EXIT',{'EXIT',{{badmatch,{error,beam_lib,
                                  {file_error,"non_existing.beam",enoent}}},
                 [{release_handler_1,get_vsn,1},
                  {release_handler_1,add_old_vsn,2},
                  {release_handler_1,eval,2},
                  {lists,foldl,3},
                  {release_handler_1,eval_script,4},
                  {release_handler,eval_appup_script,4},
                  {erl_eval,do_apply,5},
                  {shell,exprs,7}]}}}

(flm@REDACTED)3> code:which(gproc_eqc).
non_existing

After reordering the instructions in the relup so that 'remove' comes
before 'point_of_no_return', the release upgraded successfully.
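
The manual reordering described above can be sketched as a small list
transformation. This is only an illustration of the workaround, not
anything systools or release_handler does itself; the module name
relup_reorder and the function hoist_remove/1 are hypothetical, and only
{remove, _} instructions are moved, since that is all the report mentions:

```erlang
-module(relup_reorder).
-export([hoist_remove/1]).

%% Move every {remove, _} instruction that follows point_of_no_return
%% to just before it. All other instructions keep their relative order.
%% If there is no point_of_no_return, the list is returned unchanged.
hoist_remove(Instructions) ->
    {Before, After} =
        lists:splitwith(fun(I) -> I =/= point_of_no_return end,
                        Instructions),
    {Removes, Rest} = lists:partition(fun is_remove/1, After),
    Before ++ Removes ++ Rest.

is_remove({remove, _}) -> true;
is_remove(_) -> false.
```

Applied to the upgrade instructions in the relup above, this would place
{remove,{gproc_eqc,brutal_purge,brutal_purge}} directly after the
load_object_code instruction and before point_of_no_return.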

From bernie@REDACTED  Fri Mar 11 00:16:39 2011
From: bernie@REDACTED (Bernard Duggan)
Date: Fri, 11 Mar 2011 10:16:39 +1100
Subject: Variable incorrectly unbound when bound and used in binary match
Message-ID: <4D795BD7.4010708@m5net.com>

Reposting this (with slight modifications) from the questions list, where
it was originally posted.  It seemed presumptuous to go straight to the
bugs list, but the consensus seems to be that that's what this is:

So I've just run into an interesting little bit of behaviour that
doesn't seem quite right.  In the following code:
-----------------------------------------
-module(casetest).

-export([test/0]).

test() ->
      match(<<1, 2, 3, 4, 5, 6, 7, 8>>).

match(<<A:1/binary, B:8/integer, _C:B/binary, _Rest/binary>>) ->
      case A of
          B ->  wrong;
          _ ->  ok
      end.
-----------------------------------------
erlc gives me the warning

./casetest.erl:11: Warning: this clause cannot match because a previous
clause at line 10 always matches

(line 10 is the "B ->  wrong;" line).

And sure enough, if you run test/0 you get 'wrong' back.

That, in itself, is curious to me since by my understanding B should be
bound by the function header, and have no guarantee of being the same as
A.  I can't see how it could be unbound.

Doubly curious, is that if I stop using B as the size specifier of C,
like this:

match(<<A:1/binary, B:8/integer, _C:1/binary, _Rest/binary>>) ->

The warning goes away.  And the result becomes 'ok' (in spite of nothing
in the body having changed, and the only thing changing in the header
being the size of an unused variable at the tail of the binary).
Similarly, if I change the body of match/1 to this:

      Z = B,
      case A of
          Z ->  wrong;
          _ ->  ok
      end.

It also works.
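
For reference, the workaround above written out as a complete module. This is a minimal sketch: the module name casetest_workaround is hypothetical, and the comments describe what the pattern should bind, not what an affected compiler actually produces.

```erlang
-module(casetest_workaround).

-export([test/0]).

test() ->
    match(<<1, 2, 3, 4, 5, 6, 7, 8>>).

%% A is bound to the one-byte binary <<1>>, B to the integer 2,
%% and _C takes the next B bytes of the input.
match(<<A:1/binary, B:8/integer, _C:B/binary, _Rest/binary>>) ->
    %% Copying B into a fresh variable sidesteps the reported problem.
    Z = B,
    case A of
        Z -> wrong;   % a binary never equals an integer
        _ -> ok
    end.
```

Since A is a binary and B an integer, the first case clause should never be taken, whatever the input.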

So, yeah, it kinda looks like a bug.

Cheers,

Bernard


From g@REDACTED  Fri Mar 11 23:30:00 2011
From: g@REDACTED (Garrett Smith)
Date: Fri, 11 Mar 2011 16:30:00 -0600
Subject: Missing ssl_certificate_key_file in docs
Message-ID: <AANLkTik6tQUMNKWiaCxYD+7ubqREJWPeGzKc+jBJYzpy@mail.gmail.com>

In http://www.erlang.org/doc/man/httpd.html there's no mention of
ssl_certificate_key_file, which is required for SSL support.

Unless there's another way to configure the key, that should also be a
required option when ssl is specified (as is the case for
ssl_certificate_key_file).

Garrett

From g@REDACTED  Fri Mar 11 23:39:11 2011
From: g@REDACTED (Garrett Smith)
Date: Fri, 11 Mar 2011 16:39:11 -0600
Subject: Missing ssl_certificate_key_file in docs
In-Reply-To: <AANLkTik6tQUMNKWiaCxYD+7ubqREJWPeGzKc+jBJYzpy@mail.gmail.com>
References: <AANLkTik6tQUMNKWiaCxYD+7ubqREJWPeGzKc+jBJYzpy@mail.gmail.com>
Message-ID: <AANLkTi=VyFU=O8smX0ou0eVFX5jgB+OMydEQhfUn_OBP@mail.gmail.com>

On Fri, Mar 11, 2011 at 4:30 PM, Garrett Smith <g@REDACTED> wrote:
> In http://www.erlang.org/doc/man/httpd.html there's no mention of
> ssl_certificate_key_file, which is required for SSL support.
>
> Unless there's another way to configure the key, that should also be a
> required option when ssl is specified (as is the case for
> ssl_certificate_key_file).

Er, as is the case for ssl_certificate_file.

From g@REDACTED  Sat Mar 12 04:01:45 2011
From: g@REDACTED (Garrett Smith)
Date: Fri, 11 Mar 2011 21:01:45 -0600
Subject: security_directory docs incorrect
Message-ID: <AANLkTin69TB5tkbRXaUOwTfy=qUY2gBY0u_qGRjMxm+o@mail.gmail.com>

In http://www.erlang.org/doc/man/httpd.html, the docs for security
directory properties look like this:

{security_data_file, path()}...
{security_max_retries, integer()}...
{security_block_time, integer()}...
{security_fail_expire_time, integer()}...
{security_auth_timeout, integer()}...

The values actually used are:

{data_file, path()}...
{max_retries, integer()}...
{block_time, integer()}...
{fail_expire_time, integer()}
{auth_timeout, integer()}...

Refer to mod_security.erl and mod_security_server.erl.

Garrett

From g@REDACTED  Sat Mar 12 04:24:11 2011
From: g@REDACTED (Garrett Smith)
Date: Fri, 11 Mar 2011 21:24:11 -0600
Subject: Undocumented path property required in security_directory
Message-ID: <AANLkTi=Fd-5WTgwwDY52GXcy2kjQXnf+DcfnyEpgHbVd@mail.gmail.com>

mod_security relies on a 'path' property in a security_directory. If
this property isn't available, you can't unblock blocked users.

Refer to line 453 of mod_security_server.erl.

E.g. a config like this:

  {security_directory,
                  {"/",
                   [{data_file, "security.dets"}]}},

will list undefined for the directory in the list of blocked users:

[{"me",any,8080,undefined,{{2011,3,11},{22,8,3}}}]

Config like this:

{security_directory,
                  {"/",
                   [{path, "/"},
                    {data_file, "security.dets"}]}},

however, will work fine.

I suspect that 'path' was supposed to be added implicitly to the
DataDir proplist during the 'store' operation in mod_security, rather
than require the user to explicitly configure it as in my example.
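
The idea of adding 'path' implicitly could be sketched roughly as follows. This only illustrates defaulting 'path' to the directory name; the module secdir_sketch and the function ensure_path/1 are hypothetical names and not code from mod_security.

```erlang
-module(secdir_sketch).

-export([ensure_path/1]).

%% Hypothetical helper: if the security_directory proplist has no
%% 'path' entry, add one that defaults to the directory itself.
ensure_path({security_directory, {Dir, Props}}) ->
    case proplists:is_defined(path, Props) of
        true ->
            {security_directory, {Dir, Props}};
        false ->
            {security_directory, {Dir, [{path, Dir} | Props]}}
    end.
```

Applied to the first config above, this would yield the same term as the second config, with {path, "/"} at the head of the list.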

Garrett

From g@REDACTED  Sat Mar 12 19:00:52 2011
From: g@REDACTED (Garrett Smith)
Date: Sat, 12 Mar 2011 12:00:52 -0600
Subject: httpd.hrl not in canonical location
Message-ID: <AANLkTim=c+ECunOZH3-yLvCQJTqKfNcmJ5W4o7F6Mu3J@mail.gmail.com>

The inets application doesn't have the canonical 'include' directory.
Public include files like httpd.hrl are located under 'src'.

This breaks typical usage:

-include_lib("inets/include/httpd.hrl").

See docs in http://www.erlang.org/doc/man/httpd.html.

Garrett

From bgustavsson@REDACTED  Mon Mar 14 17:03:19 2011
From: bgustavsson@REDACTED (=?UTF-8?Q?Bj=C3=B6rn_Gustavsson?=)
Date: Mon, 14 Mar 2011 17:03:19 +0100
Subject: [erlang-bugs] Variable incorrectly unbound when bound and used in
 binary match
In-Reply-To: <4D795BD7.4010708@m5net.com>
References: <4D795BD7.4010708@m5net.com>
Message-ID: <AANLkTi=TND4eHSjbKKrG7naafW7=FhBm2fLk0-=Gve4O@mail.gmail.com>

On Fri, Mar 11, 2011 at 12:16 AM, Bernard Duggan <bernie@REDACTED> wrote:
> Reposting this (with slight modifications) that was posed in the questions
> list.  It seemed presumptuous to go straight to the bugs list, but the
> consensus seems to be that that's what this is:
>
> So I've just run into an interesting little bit of behaviour that
> doesn't seem quite right.  In the following code:
> -----------------------------------------
> -module(casetest).
>
> -export([test/0]).
>
> test() ->
>     match(<<1, 2, 3, 4, 5, 6, 7, 8>>).
>
> match(<<A:1/binary, B:8/integer, _C:B/binary, _Rest/binary>>) ->
>     case A of
>         B ->  wrong;
>         _ ->  ok
>     end.
[...]

Thanks for reporting this bug.

It is indeed a bug in the handling of
variables in binary matching.

It is too late to fix the bug in R14B02,
so we will fix it in R14B03.

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB

From igor@REDACTED  Mon Mar 14 21:41:41 2011
From: igor@REDACTED (Igor Goryachev)
Date: Mon, 14 Mar 2011 23:41:41 +0300
Subject: segmentation fault in tree_delete at beam/erl_bestfit_alloc.c:431
Message-ID: <871v29a0ze.fsf@goryachev.org>

Hello.

We are suffering from quite frequent segmentation faults in our Erlang
environment. We run an R14B01 node with a very small load on Linux 2.6.32
(Debian GNU/Linux Squeeze 6.0), in a virtual machine hosted under the
OpenVZ hypervisor (16 cores, Xeon 2.40GHz).

I've tried to rebuild erlang with and without smp and threads, but in any
case I'm getting the same behaviour.

What additional helpful information should I provide?


Core was generated by `/usr/lib/erlang/erts-5.8.2/bin/beam -K true -- -root /usr/lib/erlang -progname'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
431     beam/erl_bestfit_alloc.c: No such file or directory.
        in beam/erl_bestfit_alloc.c
(gdb) where
#0  0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
#1  0x0000000000438bb2 in bf_unlink_free_block (allctr=0x7cbf20, size=<value optimized out>, cand_blk=<value optimized out>, 
    cand_size=0) at beam/erl_bestfit_alloc.c:791
#2  bf_get_free_block (allctr=0x7cbf20, size=<value optimized out>, cand_blk=<value optimized out>, cand_size=0)
    at beam/erl_bestfit_alloc.c:842
#3  0x0000000000433506 in mbc_alloc_block (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:631
#4  mbc_alloc (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:764
#5  0x00000000004b8118 in erts_alloc (c_p=0x7f93a70a90e0, reg=<value optimized out>, live=<value optimized out>, 
    build_size_term=<value optimized out>, extra_words=140272158101112, unit=8) at beam/erl_alloc.h:184
#6  erts_bin_nrml_alloc (c_p=0x7f93a70a90e0, reg=<value optimized out>, live=<value optimized out>, 
    build_size_term=<value optimized out>, extra_words=140272158101112, unit=8) at beam/erl_binary.h:253
#7  erts_bs_append (c_p=0x7f93a70a90e0, reg=<value optimized out>, live=<value optimized out>, build_size_term=<value optimized out>, 
    extra_words=140272158101112, unit=8) at beam/erl_bits.c:1325
#8  0x00000000004e0a02 in process_main () at beam/beam_emu.c:3624
#9  0x000000000043c5eb in erl_start (argc=33, argv=<value optimized out>) at beam/erl_init.c:1443
#10 0x0000000000427ac9 in main (argc=8175392, argv=0x7f93a8267460) at sys/unix/erl_main.c:29


-- 
Igor Goryachev

From elinsn@REDACTED  Tue Mar 15 08:20:26 2011
From: elinsn@REDACTED (Sergey Yelin)
Date: Tue, 15 Mar 2011 10:20:26 +0300
Subject: Error in beam compiler (R14B01 on Widows XP)
Message-ID: <AANLkTi=JGE_UtC5pB+OyzY6-CAKCP3HMwVEEikxchtrq@mail.gmail.com>

Hi list,

I've found a beam compiler error in R14B01.
Here is a simple program that works fine on R14B (erts-5.8.1.1) but fails in
R14B01 in the same environment (Windows XP SP3):

-module(problem7).
-export([find/2]).

find(0, _) ->
   0;
find(Nth, Max) ->
   lists:nth(Nth, findp(Max, [], lists:seq(2, Max))).

findp(_, _, []) ->
    [];
findp(Max, P, [X | T]) when X*X =< Max ->
    P1 = [X] ++ P,
    findp(Max, P1, [N || N <- T, N rem X =/= 0]);
findp(_, P, L) ->
    P ++ L.

Output for R14B01 (in werl.exe):

Erlang R14B01 (erts-5.8.2) [smp:4:4] [rq:4] [async-threads:0]

Eshell V5.8.2  (abort with ^G)
1> c(problem7).
./problem7.erl:none: internal error in beam_asm;
crash reason: {undef,
                  [{beam_asm,module,
                       [{problem7,
                            [{find,2},{module_info,0},{module_info,1}],
                            [],
                            [{function,find,2,2,
                                 [{label,1},
                                  {func_info,{atom,problem7},{atom,find},2},
                                  {label,2},
                                  {test,is_eq_exact,{f,3},[{x,0},{integer,0}]},
                                  return,
                                  {label,3},
                                  {allocate,2,2},
                                  {move,{x,0},{y,1}},
                                  {move,{integer,2},{x,0}},
                                  {move,{x,1},{y,0}},
                                  {call_ext,2,{extfunc,lists,seq,2}},
                                  {move,nil,{x,1}},
                                  {move,{x,0},{x,2}},
                                  {move,{y,0},{x,0}},
                                  {trim,1,1},
                                  {call,3,{f,5}},
                                  {move,{x,0},{x,1}},
                                  {move,{y,0},{x,0}},
                                  {call_ext_last,2,{extfunc,lists,nth,2},1}]},
                             {function,findp,3,5,
                                 [{label,4},
                                  {func_info,{atom,problem7},{atom,findp},3},
                                  {label,5},
                                  {test,is_nonempty_list,{f,6},[{x,2}]},
                                  {get_list,{x,2},{x,3},{x,4}},
                                  {gc_bif,'*',{f,7},5,[{x,3},{x,3}],{x,5}},
                                  {test,is_ge,{f,7},[{x,0},{x,5}]},
                                  {allocate_heap,2,2,5},
                                  {move,{x,0},{y,1}},
                                  {put_list,{x,3},{x,1},{y,0}},
                                  {move,{x,3},{x,1}},
                                  {move,{x,4},{x,0}},
                                  {call,2,{f,13}},
                                  {move,{y,0},{x,1}},
                                  {move,{x,0},{x,2}},
                                  {move,{y,1},{x,0}},
                                  {call_last,3,{f,5},2},
                                  {label,6},
                                  {test,is_nil,{f,7},[{x,2}]},
                                  {move,nil,{x,0}},
                                  return,
                                  {label,7},
                                  {move,{x,1},{x,0}},
                                  {move,{x,2},{x,1}},
                                  {call_ext_only,2,{extfunc,erlang,'++',2}}]},
                             {function,module_info,0,9,
                                 [{label,8},
                                  {func_info,
                                      {atom,problem7},
                                      {atom,module_info},
                                      0},
                                  {label,9},
                                  {move,{atom,problem7},{x,0}},
                                  {call_ext_only,1,
                                      {extfunc,erlang,get_module_info,1}}]},
                             {function,module_info,1,11,
                                 [{label,10},
                                  {func_info,
                                      {atom,problem7},
                                      {atom,module_info},
                                      1},
                                  {label,11},
                                  {move,{x,0},{x,1}},
                                  {move,{atom,problem7},{x,0}},
                                  {call_ext_only,2,
                                      {extfunc,erlang,get_module_info,2}}]},
                             {function,'-findp/3-lc$^0/1-0-',2,13,
                                 [{label,12},
                                  {func_info,
                                      {atom,problem7},
                                      {atom,'-findp/3-lc$^0/1-0-'},
                                      2},
                                  {label,13},
                                  {test,is_nonempty_list,{f,15},[{x,0}]},
                                  {get_list,{x,0},{x,2},{x,3}},
                                  {gc_bif,'rem',{f,14},4,[{x,2},{x,1}],{x,4}},
                                  {test,is_ne_exact,
                                      {f,14},
                                      [{x,4},{integer,0}]},
                                  {allocate,1,4},
                                  {move,{x,3},{x,0}},
                                  {move,{x,2},{y,0}},
                                  {call,2,{f,13}},
                                  {test_heap,2,1},
                                  {put_list,{y,0},{x,0},{x,0}},
                                  {deallocate,1},
                                  return,
                                  {label,14},
                                  {move,{x,3},{x,0}},
                                  {call_only,2,{f,13}},
                                  {label,15},
                                  {test,is_nil,{f,12},[{x,0}]},
                                  return]}],
                            16},
                        [],"z:/Projects/myeuler/problem7.erl",[]]},
                   {compile,beam_asm,1},
                   {compile,'-internal_comp/4-anonymous-1-',2},
                   {compile,fold_comp,3},
                   {compile,internal_comp,4},
                   {compile,internal,3}]}
error
2>

From bgustavsson@REDACTED  Tue Mar 15 10:08:14 2011
From: bgustavsson@REDACTED (=?UTF-8?Q?Bj=C3=B6rn_Gustavsson?=)
Date: Tue, 15 Mar 2011 10:08:14 +0100
Subject: [erlang-bugs] Error in beam compiler (R14B01 on Widows XP)
In-Reply-To: <AANLkTi=JGE_UtC5pB+OyzY6-CAKCP3HMwVEEikxchtrq@mail.gmail.com>
References: <AANLkTi=JGE_UtC5pB+OyzY6-CAKCP3HMwVEEikxchtrq@mail.gmail.com>
Message-ID: <AANLkTin3Ya+LpYFpBwdXAu1y68zHjX9o=QQir=NAsTjc@mail.gmail.com>

On Tue, Mar 15, 2011 at 8:20 AM, Sergey Yelin <elinsn@REDACTED> wrote:
> Hi list,
>
> I've found beam compiler error in R14B01.

I think you have some problems in your installation
or environment.

> Eshell V5.8.2  (abort with ^G)
> 1> c(problem7).
> ./problem7.erl:none: internal error in beam_asm;
> crash reason: {undef,
>                   [{beam_asm,module,

This error message indicates that the beam_asm:module/4
function is undefined, either because the beam_asm
module for some reason is missing or that you have
your own version of the beam_asm module without
a module/4 function.

Can you compile any Erlang module?
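
One way to check both possibilities from a running node is sketched below. The module name beam_asm_check and the function report/0 are hypothetical; the calls themselves (code:which/1, code:ensure_loaded/1, erlang:function_exported/3) are standard.

```erlang
-module(beam_asm_check).

-export([report/0]).

%% Collects where beam_asm is loaded from and whether it exports module/4.
report() ->
    %% A file path, or non_existing if the module cannot be found.
    Path = code:which(beam_asm),
    %% function_exported/3 only sees loaded modules, so load it first.
    Loaded = code:ensure_loaded(beam_asm),
    Exported = erlang:function_exported(beam_asm, module, 4),
    [{path, Path}, {ensure_loaded, Loaded}, {module_4_exported, Exported}].
```

A path outside the compiler application's ebin directory would suggest that a different beam_asm is shadowing the one shipped with OTP.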

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB

From xramtsov@REDACTED  Tue Mar 15 15:22:16 2011
From: xramtsov@REDACTED (Evgeniy Khramtsov)
Date: Tue, 15 Mar 2011 23:22:16 +0900
Subject: send_timeout doesn't work
Message-ID: <4D7F7618.4040305@gmail.com>

It seems there is a bug in the send_timeout option of a TCP socket: the 
timeout is completely ignored (at least in active-once mode).
The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl
Just compile it and start lock:listen() in one shell and lock:send() in 
another: after a while you will see that the receiving process is locked 
in prim_inet:send/3 and doesn't process the current message in its mailbox. 
You can also play with the PORT and SEND_TIMEOUT macros if needed.

Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).
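
For readers without access to the URL, the documented behaviour of send_timeout can be exercised with a small loopback sketch. This is not the lock.erl reproduction above: the module name send_timeout_sketch, the 1000 ms timeout and the 64 KB chunk size are all assumptions made for illustration. The peer never reads, so the sender's buffers eventually fill up, at which point gen_tcp:send/2 is documented to return {error, timeout} rather than block forever.

```erlang
-module(send_timeout_sketch).

-export([run/0]).

%% Connects to a local listener that never reads, then sends until
%% gen_tcp:send/2 stops returning ok. Returns that first non-ok result.
run() ->
    {ok, L} = gen_tcp:listen(0, [binary, {active, false}, {reuseaddr, true}]),
    {ok, Port} = inet:port(L),
    {ok, C} = gen_tcp:connect({127,0,0,1}, Port,
                              [binary, {active, once}, {send_timeout, 1000}]),
    {ok, S} = gen_tcp:accept(L),
    Result = fill(C, binary:copy(<<0>>, 65536)),
    gen_tcp:close(C),
    gen_tcp:close(S),
    gen_tcp:close(L),
    Result.

%% Keep sending the same chunk until send reports something other than ok.
fill(Sock, Chunk) ->
    case gen_tcp:send(Sock, Chunk) of
        ok -> fill(Sock, Chunk);
        Other -> Other
    end.
```

If run/0 never returns on an affected release, that would be consistent with the timeout being ignored.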

-- 
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:xram@REDACTED


From per.melin@REDACTED  Tue Mar 15 23:29:51 2011
From: per.melin@REDACTED (Per Melin)
Date: Tue, 15 Mar 2011 23:29:51 +0100
Subject: reltool's app_file option
Message-ID: <AANLkTi=7Krqy5tdSbpsCwbyczqgQqLAnY3VF_wB0zeTT@mail.gmail.com>

The documentation lists 'keep', 'strip' and 'all' as valid values, but
only 'keep' is allowed. The others give you an exit with "Illegal
option: {app_file,all}".

The following line in reltool_server.erl needs Val to be both 'strip'
and 'all' simultaneously:

app_file when Val =:= keep; Val =:= strip, Val =:= all ->

In 0.5.3 (R13B04) and the dev branch.
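
The precedence issue can be seen in isolation with a small sketch. The module guard_demo and its function names are hypothetical; only the first guard is taken from the line quoted above. In a guard, ',' binds tighter than ';', so the quoted guard reads as "keep, or (strip and all)".

```erlang
-module(guard_demo).

-export([as_written/1, as_intended/1]).

%% Guard as quoted from reltool_server.erl: keep OR (strip AND all).
%% The second alternative can never be true for a single value.
as_written(Val) when Val =:= keep; Val =:= strip, Val =:= all -> accepted;
as_written(_) -> rejected.

%% Presumably intended guard: keep OR strip OR all.
as_intended(Val) when Val =:= keep; Val =:= strip; Val =:= all -> accepted;
as_intended(_) -> rejected.
```

With these definitions, as_written(strip) and as_written(all) are rejected, while as_intended/1 accepts all three documented values.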

From hm@REDACTED  Wed Mar 16 16:14:13 2011
From: hm@REDACTED (=?ISO-8859-1?Q?H=E5kan_Mattsson?=)
Date: Wed, 16 Mar 2011 16:14:13 +0100
Subject: Broken Hipe in R14B02
Message-ID: <AANLkTinLAZ92O8hYQhdUnUtPsXOWNAr4S5SW3SiQ8zh0@mail.gmail.com>

I forgot to use --disable-hipe and got the following compilation error.

Time to disable hipe by default?

/Håkan

$ uname -a
Linux tellus 2.6.35-27-generic #48-Ubuntu SMP Tue Feb 22 20:25:46 UTC 2011 x86_64 GNU/Linux
$ ./configure --prefix=/usr/local/pgm/otp_R14B02 --enable-halfword-emulator && make
...
...
...
gcc  -g -O3 -I/usr/local/src/otp_src_R14B02/erts/x86_64-unknown-linux-gnu
  -fno-tree-copyrename  -D_GNU_SOURCE -DERTS_SMP -DHAVE_CONFIG_H -Wall
-Wstrict-prototypes -Wmissing-prototypes -Wdeclaration-after-statement
-DUSE_THREADS -D_THREAD_SAFE -D_REENTRANT
-DPOSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS
-Ix86_64-unknown-linux-gnu/opt/smp -Ibeam -Isys/unix -Isys/common
-Ix86_64-unknown-linux-gnu -Izlib  -Ipcre -Ihipe -I../include
-I../include/x86_64-unknown-linux-gnu -I../include/internal
-I../include/internal/x86_64-unknown-linux-gnu -c hipe/hipe_amd64.c -o
obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o
hipe/hipe_amd64.c:40: warning: large integer implicitly truncated to unsigned type
hipe/hipe_amd64.c:42: error: conflicting types for ‘hipe_patch_load_fe’
hipe/hipe_arch.h:28: note: previous declaration of ‘hipe_patch_load_fe’ was here
hipe/hipe_amd64.c:49: error: conflicting types for ‘hipe_patch_insn’
hipe/hipe_arch.h:29: note: previous declaration of ‘hipe_patch_insn’ was here
hipe/hipe_amd64.c: In function ‘hipe_patch_call’:
hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
hipe/hipe_amd64.c: In function ‘hipe_bifs_write_u64_2’:
hipe/hipe_amd64.c:371: warning: passing argument 2 of ‘term_to_Uint’ from incompatible pointer type
beam/big.h:152: note: expected ‘Uint *’ but argument is of type ‘Uint64 *’
make[3]: *** [obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o] Error 1
make[3]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
make[2]: *** [opt] Error 2
make[2]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
:

From sverker@REDACTED  Wed Mar 16 16:31:06 2011
From: sverker@REDACTED (Sverker Eriksson)
Date: Wed, 16 Mar 2011 16:31:06 +0100
Subject: [erlang-bugs] Broken Hipe in R14B02
In-Reply-To: <AANLkTinLAZ92O8hYQhdUnUtPsXOWNAr4S5SW3SiQ8zh0@mail.gmail.com>
References: <AANLkTinLAZ92O8hYQhdUnUtPsXOWNAr4S5SW3SiQ8zh0@mail.gmail.com>
Message-ID: <4D80D7BA.4080502@erix.ericsson.se>

Hipe and halfword emulator do not play nice together yet.

Any problems without --enable-halfword-emulator?

/Sverker, Erlang/OTP


From boris.muehmer@REDACTED  Wed Mar 16 16:51:00 2011
From: boris.muehmer@REDACTED (=?UTF-8?Q?Boris_M=C3=BChmer?=)
Date: Wed, 16 Mar 2011 16:51:00 +0100
Subject: R14B02: "make install-docs" fails on Ubuntu 10.04 / 10.10 (using the
 source tar-ball)
Message-ID: <AANLkTincySyZmV7+F+1WP1brHveiXuwL=APV1CiAk4eT@mail.gmail.com>

"make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
R14B02 source tar-ball (like in "R14B01").

The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
would be to change line 412 from
        $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
to
        $(ERL_TOP)/bin/escript $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml


  - boris

From andrew@REDACTED  Wed Mar 16 19:15:56 2011
From: andrew@REDACTED (Andrew Thompson)
Date: Wed, 16 Mar 2011 14:15:56 -0400
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu
 10.04 / 10.10 (using the source tar-ball)
In-Reply-To: <AANLkTincySyZmV7+F+1WP1brHveiXuwL=APV1CiAk4eT@mail.gmail.com>
References: <AANLkTincySyZmV7+F+1WP1brHveiXuwL=APV1CiAk4eT@mail.gmail.com>
Message-ID: <20110316181555.GL6177@hijacked.us>

On Wed, Mar 16, 2011 at 04:51:00PM +0100, Boris Mühmer wrote:
> "make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
> R14B02 source tar-ball (like in "R14B01").
> 
> The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
> would be to change line 412 from
>         $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript
> -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
> to
>         $(ERL_TOP)/bin/escript
> $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
> $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
> 

Came here to report this. Boris' fix seems to solve the issue but then I
get this error:

=== Entering application common_test
make  RELEASE_PATH=/usr/local/lib/erlang   release_docs_spec 
escript
/Users/andrew/otp_src_R14B02/lib/erl_docgen/priv/bin/xml_from_edoc.escript
-preprocess true -i /include \
		-i ../../../test_server/include -i  ../../include \
		-i ../../../../erts/lib/kernel/include -i
../../../../lib/kernel/include \
		-i ../../../../erts/lib/snmp/include -i ../../../../lib/snmp/include
../../src/ct.erl
escript: exception error: undefined function edoc:file/2
  in function  erl_eval:local_func/5
  in call from escript:interpret/4
  in call from escript:start/1
  in call from init:start_it/1
  in call from init:start_em/1
make[5]: *** [ct.xml] Error 127
make[4]: *** [release_docs] Error 2
make[3]: *** [release_docs] Error 2
make[2]: *** [release_docs] Error 2
make[1]: *** [release_docs] Error 2
make: *** [install-docs] Error 2

Looks like a similar problem, except that this escript is being invoked
via 'escript'. Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
seems to fix it, but it's wrong in a bunch of the doc Makefiles.

Andrew

From andrew@REDACTED  Wed Mar 16 19:21:48 2011
From: andrew@REDACTED (Andrew Thompson)
Date: Wed, 16 Mar 2011 14:21:48 -0400
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu
 10.04 / 10.10 (using the source tar-ball)
In-Reply-To: <20110316181555.GL6177@hijacked.us>
References: <AANLkTincySyZmV7+F+1WP1brHveiXuwL=APV1CiAk4eT@mail.gmail.com>
 <20110316181555.GL6177@hijacked.us>
Message-ID: <20110316182147.GM6177@hijacked.us>

On Wed, Mar 16, 2011 at 02:15:56PM -0400, Andrew Thompson wrote:
> Looks like a similar problem, except that this escript is being invoked
> via 'escript'. Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
> seems to fix it, but it's wrong in a bunch of the doc Makefiles.
>

That should be $(ERL_TOP)/bin/escript, obviously.

Andrew

From lukas@REDACTED  Thu Mar 17 10:38:17 2011
From: lukas@REDACTED (Lukas Larsson)
Date: Thu, 17 Mar 2011 10:38:17 +0100
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu
 10.04 / 10.10 (using the source tar-ball)
In-Reply-To: <20110316181555.GL6177@hijacked.us>
References: <AANLkTincySyZmV7+F+1WP1brHveiXuwL=APV1CiAk4eT@mail.gmail.com>
	 <20110316181555.GL6177@hijacked.us>
Message-ID: <1300354697.2336.37.camel@bilbo>

Hi Boris and Andrew!

Thanks for pointing this out. It has to do with the fact that an old
version of escript (and by extension the Erlang VM) is used to build the
docs. A workaround for now is to add /your/r14b02/path/bin/ to your
PATH and then build the docs. 

However, I'm unsure whether the patch you provided will work for us, as it
might not always be true that one would want to use the
$ERL_TOP/bin/escript emulator to build the docs. I'll try to come up
with a solution that works for both scenarios. 

Lukas



From boris.muehmer@REDACTED  Thu Mar 17 11:04:08 2011
From: boris.muehmer@REDACTED (=?UTF-8?Q?Boris_M=C3=BChmer?=)
Date: Thu, 17 Mar 2011 11:04:08 +0100
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu 10.04 /
 10.10 (using the source tar-ball)
In-Reply-To: <1300354697.2336.37.camel@bilbo>
References: <AANLkTincySyZmV7+F+1WP1brHveiXuwL=APV1CiAk4eT@mail.gmail.com>
	<20110316181555.GL6177@hijacked.us>
	<1300354697.2336.37.camel@bilbo>
Message-ID: <AANLkTikMKSMS2PzG1FPg3D2c6_u8-NQuM=85epkPu2Ds@mail.gmail.com>

The "funny" thing was (this was also true when I wrote about it concerning
the R14B01 release), that adding the path wasn't enough.

My normal procedure for installing from source is:

    1> tar xvf <TARBALL>
    2> cd <SRCDIR>
    3> export LANG=C
    4> export ERL_TOP="`pwd`"
    5> export PATH="$ERL_TOP/bin:$PATH"
    6> ./configure --prefix=<DESTDIR>
    7> ( make all && make install && make docs && make install-docs ) 2>&1 | tee log-build.txt

With step 5 the "right" escript should be in the PATH, but without patching
Makefile/Makefile.in "make install-docs" still fails on "my" systems.

Currently I don't understand why "env" fails to locate the right "escript"
from the PATH.
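
One way to see which escript a PATH lookup resolves to, from inside an Erlang node started in the same environment, is sketched below. The module name escript_path_check and the function report/0 are hypothetical; os:find_executable/1 and os:getenv/1 are standard calls. Note that this reflects the PATH seen by that node, which may differ from the environment make passes to its recipes.

```erlang
-module(escript_path_check).

-export([report/0]).

%% Compares the escript found via PATH with the one under $ERL_TOP/bin.
report() ->
    %% Full path of the first escript on PATH, or false if there is none.
    Found = os:find_executable("escript"),
    Expected = case os:getenv("ERL_TOP") of
                   false -> undefined;   % ERL_TOP is not set
                   Top -> filename:join([Top, "bin", "escript"])
               end,
    [{found_on_path, Found}, {expected, Expected}].
```

If the two entries differ, the doc build is probably picking up an escript other than the freshly built one.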

Besides: there is neither an Erlang installation from the Ubuntu repositories
on my systems, nor is another Erlang installation bin-directory in my PATH.


  - boris


2011/3/17 Lukas Larsson <lukas@REDACTED>:
> Hi Boris and Andrew!
>
> Thanks for pointing this out. It has to do with the fact that an old
> version of escript (and in extension the erlang VM) is used to build the
> docs. A workaround for now is to add /your/r14b02/path/bin/ into your
> PATH and then build the docs.
>
> I'm however unsure if the patch you provided will work for us as it
> might not always be true that one would want to use the
> $ERL_TOP/bin/escript emulator to build the docs. I'll try to come up
> with a solutions which works for both scenarios.
>
> Lukas
>
> On Wed, 2011-03-16 at 14:15 -0400, Andrew Thompson wrote:
>> On Wed, Mar 16, 2011 at 04:51:00PM +0100, Boris Mühmer wrote:
>> > "make install-docs" fails on Ubuntu 10.04 / 10.10 systems using the
>> > R14B02 source tar-ball (like in "R14B01").
>> >
>> > The simple fix for "$ERL_TOP/Makefile.in" (or "$ERL_TOP/Makefile")
>> > would be to change line 412 from
>> >         $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript
>> > -topdir $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
>> > to
>> >         $(ERL_TOP)/bin/escript
>> > $(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
>> > $(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml
>> >
>>
>> Came here to report this. Boris' fix seems to solve the issue but then I
>> get this error:
>>
>> === Entering application common_test
>> make  RELEASE_PATH=/usr/local/lib/erlang   release_docs_spec
>> escript
>> /Users/andrew/otp_src_R14B02/lib/erl_docgen/priv/bin/xml_from_edoc.escript
>> -preprocess true -i /include \
>>               -i ../../../test_server/include -i  ../../include \
>>               -i ../../../../erts/lib/kernel/include -i
>> ../../../../lib/kernel/include \
>>               -i ../../../../erts/lib/snmp/include -i ../../../../lib/snmp/include
>> ../../src/ct.erl
>> escript: exception error: undefined function edoc:file/2
>>   in function  erl_eval:local_func/5
>>   in call from escript:interpret/4
>>   in call from escript:start/1
>>   in call from init:start_it/1
>>   in call from init:start_em/1
>> make[5]: *** [ct.xml] Error 127
>> make[4]: *** [release_docs] Error 2
>> make[3]: *** [release_docs] Error 2
>> make[2]: *** [release_docs] Error 2
>> make[1]: *** [release_docs] Error 2
>> make: *** [install-docs] Error 2
>>
>> Looks like a similar problem, except that this escript is being invoked
>> via 'escript'. Changing 'escript' in the Makefile to '$(OTP_TOP)/escript'
>> seems to fix it, but it's wrong in a bunch of the doc Makefiles.
>>
>> Andrew
>>
>> ________________________________________________________________
>> erlang-bugs (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
>
>

From lukas@REDACTED  Thu Mar 17 11:15:49 2011
From: lukas@REDACTED (Lukas Larsson)
Date: Thu, 17 Mar 2011 11:15:49 +0100
Subject: [erlang-bugs] R14B02: "make install-docs" fails on Ubuntu
 10.04 / 10.10 (using the source tar-ball)
In-Reply-To: <AANLkTikMKSMS2PzG1FPg3D2c6_u8-NQuM=85epkPu2Ds@mail.gmail.com>
References: <AANLkTincySyZmV7+F+1WP1brHveiXuwL=APV1CiAk4eT@mail.gmail.com>
	 <20110316181555.GL6177@hijacked.us> <1300354697.2336.37.camel@bilbo>
	 <AANLkTikMKSMS2PzG1FPg3D2c6_u8-NQuM=85epkPu2Ds@mail.gmail.com>
Message-ID: <1300356950.2336.57.camel@bilbo>

Ah, ok. Then the workaround only works for Andrew's problem, but not for
yours. Adding escript (without the $(ERL_TOP)/bin) before
$(ERL_TOP)/lib/erl_docgen/priv/bin/xref_mod_app.escript -topdir
$(ERL_TOP) -outfile $(ERL_TOP)/make/$(TARGET)/mod2app.xml should do the
trick as long as you have the latest version of erlang in your path
while building. 

That will probably have to be good enough for now, unless someone comes
up with a solution which addresses the problem with configuring which vm
to use.

Lukas


From mikpe@REDACTED  Thu Mar 17 11:43:36 2011
From: mikpe@REDACTED (Mikael Pettersson)
Date: Thu, 17 Mar 2011 11:43:36 +0100
Subject: [erlang-bugs] Broken Hipe in R14B02
In-Reply-To: <4D80D7BA.4080502@erix.ericsson.se>
References: <AANLkTinLAZ92O8hYQhdUnUtPsXOWNAr4S5SW3SiQ8zh0@mail.gmail.com>
	<4D80D7BA.4080502@erix.ericsson.se>
Message-ID: <19841.58840.701872.705972@pilspetsen.it.uu.se>

On Wed, 16 Mar 2011 16:31:06 +0100, Sverker Eriksson wrote:
> Hipe and halfword emulator do not play nice together yet.
> 
> Any problems without --enable-halfword-emulator?
> 
> /Sverker, Erlang/OTP
> 
> 
> Håkan Mattsson wrote:
> > I forgot to use --disable-hipe and got the following compilation error.
> >
> > Time to disable hipe by default?
> >
> > /Håkan
> >
> > $ uname -a
> > Linux tellus 2.6.35-27-generic #48-Ubuntu SMP Tue Feb 22 20:25:46 UTC
> > 2011 x86_64 GNU/Linux
> > $ ./configure --prefix=/usr/local/pgm/otp_R14B02
> > --enable-halfword-emulator && make
> > ...
> > ...
> > ...
> > gcc  -g -O3 -I/usr/local/src/otp_src_R14B02/erts/x86_64-unknown-linux-gnu
> >   -fno-tree-copyrename  -D_GNU_SOURCE -DERTS_SMP -DHAVE_CONFIG_H -Wall
> > -Wstrict-prototypes -Wmissing-prototypes -Wdeclaration-after-statement
> > -DUSE_THREADS -D_THREAD_SAFE -D_REENTRANT
> > -DPOSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS
> > -Ix86_64-unknown-linux-gnu/opt/smp -Ibeam -Isys/unix -Isys/common
> > -Ix86_64-unknown-linux-gnu -Izlib  -Ipcre -Ihipe -I../include
> > -I../include/x86_64-unknown-linux-gnu -I../include/internal
> > -I../include/internal/x86_64-unknown-linux-gnu -c hipe/hipe_amd64.c -o
> > obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o
> > hipe/hipe_amd64.c:40: warning: large integer implicitly truncated to
> > unsigned type
> > hipe/hipe_amd64.c:42: error: conflicting types for ‘hipe_patch_load_fe’
> > hipe/hipe_arch.h:28: note: previous declaration of ‘hipe_patch_load_fe’ was here
> > hipe/hipe_amd64.c:49: error: conflicting types for ‘hipe_patch_insn’
> > hipe/hipe_arch.h:29: note: previous declaration of ‘hipe_patch_insn’ was here
> > hipe/hipe_amd64.c: In function ‘hipe_patch_call’:
> > hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
> > hipe/hipe_amd64.c:77: warning: cast from pointer to integer of different size
> > hipe/hipe_amd64.c: In function ‘hipe_bifs_write_u64_2’:
> > hipe/hipe_amd64.c:371: warning: passing argument 2 of ‘term_to_Uint’
> > from incompatible pointer type
> > beam/big.h:152: note: expected ‘Uint *’ but argument is of type ‘Uint64 *’
> > make[3]: *** [obj/x86_64-unknown-linux-gnu/opt/smp/hipe_amd64.o] Error 1
> > make[3]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
> > make[2]: *** [opt] Error 2
> > make[2]: Leaving directory `/usr/local/src/otp_src_R14B02/erts/emulator'
> > :

The halfword emulator is a new beast with an execution mode that
differs significantly from existing 32- and 64-bit modes.  The build
error in the runtime code you got is just the tip of the iceberg;
non-trivial changes to the compiler would be required to support the
halfword emulator.

For now the best solution is to auto-disable HiPE if halfword emulator
is enabled, and error out if both are explicitly enabled.
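
As a rough illustration of that rule (a sketch only, not the actual OTP
configure code): decide_hipe is a hypothetical helper, and it assumes that
HiPE is on by default when nothing is specified.

```shell
#!/bin/sh
# Hypothetical decision helper, not taken from OTP's configure script.
# $1: halfword emulator request: yes | no
# $2: HiPE request: yes | no | default (option not given)
# Prints the resulting HiPE setting, or fails if both were asked for.
decide_hipe() {
    halfword=$1
    hipe=$2
    if [ "$halfword" = yes ]; then
        case $hipe in
            yes)
                echo "error: HiPE and the halfword emulator cannot both be enabled" >&2
                return 1
                ;;
            *)
                # Auto-disable, whether HiPE was defaulted or explicitly off.
                echo no
                ;;
        esac
    else
        case $hipe in
            no) echo no ;;
            # Assumption for this sketch: HiPE defaults to enabled.
            *)  echo yes ;;
        esac
    fi
}
```

With this, "decide_hipe yes default" prints "no" (silently disabled), while
"decide_hipe yes yes" prints an error and returns non-zero.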

(If someone wants to fund the development of HiPE support for
halfword emulator on Linux/AMD64, contact me offline.)

/Mikael

From sverker@REDACTED  Thu Mar 17 12:13:29 2011
From: sverker@REDACTED (Sverker Eriksson)
Date: Thu, 17 Mar 2011 12:13:29 +0100
Subject: [erlang-bugs] Broken Hipe in R14B02
In-Reply-To: <19841.58840.701872.705972@pilspetsen.it.uu.se>
References: <AANLkTinLAZ92O8hYQhdUnUtPsXOWNAr4S5SW3SiQ8zh0@mail.gmail.com>	<4D80D7BA.4080502@erix.ericsson.se> <19841.58840.701872.705972@pilspetsen.it.uu.se>
Message-ID: <4D81ECD9.6040103@erix.ericsson.se>

Mikael Pettersson wrote:
> For now the best solution is to auto-disable HiPE if halfword emulator
> is enabled, and error out if both are explicitly enabled.
>   
Agree. It will turn up in dev branch for next release.

/Sverker, Erlang/OTP


From eric.pailleau@REDACTED  Thu Mar 17 22:25:07 2011
From: eric.pailleau@REDACTED (PAILLEAU Eric)
Date: Thu, 17 Mar 2011 22:25:07 +0100
Subject: [erlang-bugs] wx undefined symbol
In-Reply-To: <4D6D5807.4020408@pailleau.org>
References: <4D5AF2D8.4070105@wanadoo.fr> <4D5AF612.3010701@wanadoo.fr> <4D6D5807.4020408@pailleau.org>
Message-ID: <4D827C33.8060404@wanadoo.fr>

Hi,

I tried with the latest R14B02, and WX is working without changing
anything else (?!).
I do not see in the readme file what could have solved my problem, but
anyway, it works now.

I just got some annoying outputs in the erl shell :
  (Erlang:12555): Gtk-WARNING **: gtk_widget_size_allocate(): attempt
to allocate widget with width -5 and height 17

I got this by playing with wx:demo().

Thanks to Dan Gudmundsson for his help, even if I did not have the time
to try his tips.

regards.


From pan@REDACTED  Fri Mar 18 10:49:16 2011
From: pan@REDACTED (pan@REDACTED)
Date: Fri, 18 Mar 2011 10:49:16 +0100
Subject: [erlang-bugs] segmentation fault in tree_delete at
 beam/erl_bestfit_alloc.c:431
In-Reply-To: <871v29a0ze.fsf@goryachev.org>
References: <871v29a0ze.fsf@goryachev.org>
Message-ID: <Pine.LNX.4.64.1103181027080.10609@arwen.otp.ericsson.se>

Hi Igor!

Sadly enough, this is the worst kind of core you could ever have :(

The core is generated in the allocators, but that's most probably not the 
allocators' fault. Something has written outside of an allocated area 
earlier and now the error shows up in some (possibly/probably) unrelated 
place.

First of all, I have to ask if you have some non-OTP drivers or NIF's 
loaded in the VM? Have you loaded some native code not supplied in the 
Erlang distribution? In that case, try to rule out errors in that code and 
in libraries loaded by that code by e.g. disabling it in some way (write 
slower erlang-replacements etc).

Next question is if you use some drivers or NIF's provided by us that pull 
in third party libraries, like Wx or Crypto (by using SSL etc). If we could 
isolate the problem to a driver (ours or yours) the search space would be 
greatly reduced.

Also, looking at the core locally would possibly help me to identify the 
type of data that has been written into the block, which possibly could 
narrow it down, so if you could tar your compiled build tree and the core 
and put it on something where I can fetch it (mail me personally with the 
details, if you can do that), that would be helpful.

If the workload is low, running the VM under Valgrind, would probably be 
feasible. There is a special valgrind target when doing make in the 
$ERL_TOP/erts/emulator directory, you can do 'make FLAVOR=smp valgrind' if 
you have valgrind 3.4 or higher installed on the system. Running cerl 
-valgrind (from the $ERL_TOP/bin directory) would then start erlang in the 
valgrind virtual environment, which should point out any illegal memory 
accesses (note that some warnings are expected, namely a lot of 
PossiblyLost, which is due to us keeping pointers *into* structures 
instead of to the beginning of the structures).
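
The valgrind steps above could be wrapped roughly like this. It is a sketch
under assumptions: the helper names are made up, ERL_TOP must already point
at the built source tree, and "valgrind --version" is assumed to print
something of the form "valgrind-3.x.y".

```shell
#!/bin/sh
# version_at_least HAVE WANT: succeed if dotted version HAVE >= WANT.
# Only the major and minor components are compared, and both arguments
# are assumed to contain at least "major.minor".
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "$have_minor" -ge "$want_minor" ]; }
}

# Hypothetical wrapper: build the valgrind flavour of the emulator and
# start Erlang under valgrind, using the commands quoted in the mail.
run_under_valgrind() {
    v=$(valgrind --version) || return 1
    v=${v#valgrind-}
    version_at_least "$v" 3.4 || {
        echo "valgrind $v is older than 3.4" >&2
        return 1
    }
    (cd "$ERL_TOP/erts/emulator" && make FLAVOR=smp valgrind) &&
        "$ERL_TOP/bin/cerl" -valgrind
}
```

Nothing runs until run_under_valgrind is called, so the file can be sourced
safely on a machine without valgrind installed.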

Another possibility is to compile all C code with -D_FORTIFY_SOURCE, which 
may find faulty memory accesses too.

You say this is frequent. Is it in any way manually reproducible? Have you 
got any idea of which erlang-code is run when this happens (i.e. during 
some special kind of workload)? One possibility is that this is a compiler 
error (in our compiler that is), so a module triggering the problem would 
also be interesting.

Please make sure to run R14B02 and recompile all erlang code with the 
latest Erlang version to rule out any bug that's already corrected :)

Sorry for the big fluffy list of options, but as I said, this is a kind of 
error that is really hard to track down...

Cheers,
/Patrik

On Mon, 14 Mar 2011, Igor Goryachev wrote:

> Hello.
>
> We are suffering from quite frequent segmentation faults in our erlangish
> environment. We run an r14b01 node with a very small load on linux 2.6.32
> (Debian GNU/Linux Squeeze 6.0), which is a virtual machine hosted under
> the OpenVZ hypervisor (16 cores, Xeon 2.40GHz).
>
> I've tried to rebuild erlang with and without smp and threads, but in any
> case I'm getting the same behaviour.
>
> What additional helpful information should I provide?
>
>
> Core was generated by `/usr/lib/erlang/erts-5.8.2/bin/beam -K true -- -root /usr/lib/erlang -progname'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
> 431     beam/erl_bestfit_alloc.c: No such file or directory.
>        in beam/erl_bestfit_alloc.c
> (gdb) where
> #0  0x0000000000437e83 in tree_delete (allctr=0x7cbf20, del=0x7f93a8267460) at beam/erl_bestfit_alloc.c:431
> #1  0x0000000000438bb2 in bf_unlink_free_block (allctr=0x7cbf20, size=<value optimized out>, cand_blk=<value optimized out>,
>    cand_size=0) at beam/erl_bestfit_alloc.c:791
> #2  bf_get_free_block (allctr=0x7cbf20, size=<value optimized out>, cand_blk=<value optimized out>, cand_size=0)
>    at beam/erl_bestfit_alloc.c:842
> #3  0x0000000000433506 in mbc_alloc_block (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:631
> #4  mbc_alloc (allctr=0x7cbf20, size=287) at beam/erl_alloc_util.c:764
> #5  0x00000000004b8118 in erts_alloc (c_p=0x7f93a70a90e0, reg=<value optimized out>, live=<value optimized out>,
>    build_size_term=<value optimized out>, extra_words=140272158101112, unit=8) at beam/erl_alloc.h:184
> #6  erts_bin_nrml_alloc (c_p=0x7f93a70a90e0, reg=<value optimized out>, live=<value optimized out>,
>    build_size_term=<value optimized out>, extra_words=140272158101112, unit=8) at beam/erl_binary.h:253
> #7  erts_bs_append (c_p=0x7f93a70a90e0, reg=<value optimized out>, live=<value optimized out>, build_size_term=<value optimized out>,
>    extra_words=140272158101112, unit=8) at beam/erl_bits.c:1325
> #8  0x00000000004e0a02 in process_main () at beam/beam_emu.c:3624
> #9  0x000000000043c5eb in erl_start (argc=33, argv=<value optimized out>) at beam/erl_init.c:1443
> #10 0x0000000000427ac9 in main (argc=8175392, argv=0x7f93a8267460) at sys/unix/erl_main.c:29
>
>
> -- 
> Igor Goryachev
>

From kwidoyo@REDACTED  Fri Mar 18 11:03:43 2011
From: kwidoyo@REDACTED (Kustarto Widoyo)
Date: Fri, 18 Mar 2011 19:03:43 +0900
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To: <Pine.LNX.4.64.1103081357430.10609@arwen.otp.ericsson.se>
References: <4D4BA2F3.2080405@geminimobile.com> <Pine.LNX.4.64.1103011146180.10609@arwen.otp.ericsson.se> <4D75E41C.1080502@geminimobile.com> <Pine.LNX.4.64.1103081357430.10609@arwen.otp.ericsson.se>
Message-ID: <4D832DFF.8000008@geminimobile.com>

Patrik investigated this issue; FYI, his comment follows.

 > The problem encountered is that the internal printf gets a really deep
 > structure to format for the call erlang:system_info(procs), called from a
 > fun declared in gmt_cinfo_basic:erlang_system_info. The call formats a
 > binary with debug information for every process in this huge system, of
 > which one has this reeeeally deep list structure.
 >
 > I shall limit the depth of the output in the erts_printf call, but you
 > should really not call erlang:system_info(procs) in such a huge system.
 > Using that kind of debug functionality in the system will cost *a lot*
 > in terms of memory and CPU. So a workaround for this would be to disable
 > this dumping of process debug information in your system.

Thank you very much Patrik.

Widoyo

From pan@REDACTED  Fri Mar 18 11:24:33 2011
From: pan@REDACTED (pan@REDACTED)
Date: Fri, 18 Mar 2011 11:24:33 +0100
Subject: [erlang-bugs] Segmentation fault in erts_printf_char (R13B04)
In-Reply-To: <4D832DFF.8000008@geminimobile.com>
References: <4D4BA2F3.2080405@geminimobile.com>
 <Pine.LNX.4.64.1103011146180.10609@arwen.otp.ericsson.se>
 <4D75E41C.1080502@geminimobile.com> <Pine.LNX.4.64.1103081357430.10609@arwen.otp.ericsson.se>
 <4D832DFF.8000008@geminimobile.com>
Message-ID: <Pine.LNX.4.64.1103181119290.10609@arwen.otp.ericsson.se>

Hi!

A correction to erts_printf, that makes it not recurse on the C stack any 
more, is on its way. Expect to see it in the GitHub dev branch in a few 
days.

On Fri, 18 Mar 2011, Kustarto Widoyo wrote:

> Patrik investigated this issue; FYI, his comment follows.
>
>> The problem encountered is that the internal printf gets a really deep
>> structure to format for the call erlang:system_info(procs), called from a
>> fun declared in gmt_cinfo_basic:erlang_system_info. The call formats a
>> binary with debug information for every process in this huge system, of
>> which one has this reeeeally deep list structure.
>>
>> I shall limit the depth of the output in the erts_printf call, but you
>> should really not call erlang:system_info(procs) in such a huge system.
>> Using that kind of debug functionality in the system will cost *a lot*
>> in terms of memory and CPU. So a workaround for this would be to disable
>> this dumping of process debug information in your system.
>
> Thank you very much Patrik.

Thank you for the help in tracking this down!
>
> Widoyo

Cheers,
/Patrik

From xramtsov@REDACTED  Mon Mar 21 07:34:39 2011
From: xramtsov@REDACTED (Evgeniy Khramtsov)
Date: Mon, 21 Mar 2011 15:34:39 +0900
Subject: send_timeout doesn't work
In-Reply-To: <4D7F7618.4040305@gmail.com>
References: <4D7F7618.4040305@gmail.com>
Message-ID: <4D86F17F.6060702@gmail.com>

15.03.2011 23:22, Evgeniy Khramtsov wrote:
> It seems like there is a bug in send_timeout option of a TCP socket: 
> the timeout is completely ignored (at least in active-once mode).
> The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl
> Just compile it and start lock:listen() in one shell and lock:send() 
> in another: over a time you will see that the receiving process is 
> locked in prim_inet:send/3 and doesn't process current message in the 
> mailbox. You can also play with PORT and SEND_TIMEOUT macros if needed.
>
> Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).
>

Any response on this? Has anyone been able to reproduce the problem?

-- 
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:xram@REDACTED


From spawn.think@REDACTED  Mon Mar 21 10:59:41 2011
From: spawn.think@REDACTED (Ahmed Omar)
Date: Mon, 21 Mar 2011 10:59:41 +0100
Subject: [erlang-bugs] Re: send_timeout doesn't work
In-Reply-To: <4D86F17F.6060702@gmail.com>
References: <4D7F7618.4040305@gmail.com>
	<4D86F17F.6060702@gmail.com>
Message-ID: <AANLkTimFYj+YSY1W5Guso=Q4gr2Eqb4SPzBT+qc008SQ@mail.gmail.com>

http://erlang.2086793.n4.nabble.com/tcp-connection-with-timeout-td2090360.html

On Mon, Mar 21, 2011 at 7:34 AM, Evgeniy Khramtsov <xramtsov@REDACTED> wrote:

> 15.03.2011 23:22, Evgeniy Khramtsov wrote:
>
>> It seems like there is a bug in send_timeout option of a TCP socket: the
>> timeout is completely ignored (at least in active-once mode).
>> The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl
>> Just compile it and start lock:listen() in one shell and lock:send() in
>> another: over a time you will see that the receiving process is locked in
>> prim_inet:send/3 and doesn't process current message in the mailbox. You can
>> also play with PORT and SEND_TIMEOUT macros if needed.
>>
>> Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).
>>
>>
> Any response on this? Has anyone been able to reproduce the problem?
>
>
> --
> Regards,
> Evgeniy Khramtsov, ProcessOne.
> xmpp:xram@REDACTED
>
>


-- 
Best Regards,
- Ahmed Omar
http://nl.linkedin.com/in/adiaa
Follow me on twitter
@spawn_think <http://twitter.com/#!/spawn_think>

From xramtsov@REDACTED  Mon Mar 21 11:18:49 2011
From: xramtsov@REDACTED (Evgeniy Khramtsov)
Date: Mon, 21 Mar 2011 19:18:49 +0900
Subject: [erlang-bugs] Re: send_timeout doesn't work
In-Reply-To: <AANLkTimFYj+YSY1W5Guso=Q4gr2Eqb4SPzBT+qc008SQ@mail.gmail.com>
References: <4D7F7618.4040305@gmail.com>	<4D86F17F.6060702@gmail.com> <AANLkTimFYj+YSY1W5Guso=Q4gr2Eqb4SPzBT+qc008SQ@mail.gmail.com>
Message-ID: <4D872609.6060006@gmail.com>

21.03.2011 18:59, Ahmed Omar wrote:
> http://erlang.2086793.n4.nabble.com/tcp-connection-with-timeout-td2090360.html
>    

So what? How does that relate to the fact that send_timeout never works?

-- 
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:xram@REDACTED


From pan@REDACTED  Mon Mar 21 15:07:30 2011
From: pan@REDACTED (pan@REDACTED)
Date: Mon, 21 Mar 2011 15:07:30 +0100
Subject: [erlang-bugs] send_timeout doesn't work
In-Reply-To: <4D7F7618.4040305@gmail.com>
References: <4D7F7618.4040305@gmail.com>
Message-ID: <Pine.LNX.4.64.1103211503231.10609@arwen.otp.ericsson.se>

Hi!

Very good test program. There is obviously something wrong here, and that's 
the handling of timeouts when we are in active mode. It's broken.

Please try the attached (very simple) patch, it should fix the problem. 
It's still not tested in our daily builds, but it will soon be. Any 
feedback is welcome!

Cheers,
/Patrik

On Tue, 15 Mar 2011, Evgeniy Khramtsov wrote:

> It seems like there is a bug in send_timeout option of a TCP socket: the 
> timeout is completely ignored (at least in active-once mode).
> The code to reproduce: http://kuku.jabber.ru/~xram/lock.erl
> Just compile it and start lock:listen() in one shell and lock:send() in 
> another: over a time you will see that the receiving process is locked in 
> prim_inet:send/3 and doesn't process current message in the mailbox. You can 
> also play with PORT and SEND_TIMEOUT macros if needed.
>
> Versions tested: R13B02 and R14B01 (on Debian 2.6.32-5-amd64 SMP).
>
> -- 
> Regards,
> Evgeniy Khramtsov, ProcessOne.
> xmpp:xram@REDACTED
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tcp_send_timeout.diff
Type: text/x-patch
Size: 587 bytes
Desc: fix for gen_tcp send timeouts
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20110321/fc597ce2/attachment.bin>

From igor@REDACTED  Mon Mar 21 19:46:40 2011
From: igor@REDACTED (Igor Goryachev)
Date: Mon, 21 Mar 2011 21:46:40 +0300
Subject: [erlang-bugs] segmentation fault in tree_delete at beam/erl_bestfit_alloc.c:431
In-Reply-To: <Pine.LNX.4.64.1103181027080.10609@arwen.otp.ericsson.se>
	(pan@erlang.org's message of "Fri, 18 Mar 2011 10:49:16 +0100")
References: <871v29a0ze.fsf@goryachev.org>
	<Pine.LNX.4.64.1103181027080.10609@arwen.otp.ericsson.se>
Message-ID: <87hbawwbu7.fsf@goryachev.org>

Hi, Patrik.

On Fri, Mar 18, 2011 at 12:49, somebody wrote:

> First of all, I have to ask if you have some non-OTP drivers or NIF's
> loaded in the VM? Have you loaded some native code not supplied in the
> Erlang distribution? In that case, try to rule out errors in that code
> and in libraries loaded by that code by e.g. disabling it in some way
> (write slower erlang-replacements etc).

We have a pair of nodes per machine which are a sort of frontend/backend
and speak with each other using standard erlangish rpc. The frontend
node (the one which segfaults) uses the exmpp library by ProcessOne. Other
third-party libraries (and my own code) do not contain non-OTP linked-in
drivers or NIFs.

> Next question is if you use some drivers or NIF's provided by us that
> pull in third party libraries, like Wx or Crypto (by using SSL etc). If
> we could isolate the problem to a driver (ours or yours) the
> search space would be greatly reduced.

We have no encryption here; the crypto application is loaded only for
sha/1 usage. No wx, etc...

> Also, looking at the core locally would possibly help me to identify
> the type of data that has been written into the block, which possibly
> could narrow it down, so if you could tar your compiled build tree and
> the core and put it on something where I can fetch it (mail me
> personally with the details, if you can do that), that would be
> helpful.

Ok, I will prepare a tarball as soon as possible and send you a link.

> You say this is frequent. Is it in any way manually reproducible? Have
> you got any idea of which erlang-code is run when this happens
> (i.e. during some special kind of workload)? One possibility is that
> this is a compiler error (in our compiler that is), so a module
> triggering the problem would also be interesting.

It occurs two or three times a day. For now I have no idea how
it could be reproduced manually.

> Please make sure to run R14B02 and recompile all erlang code with the
> latest Erlang version to rule out any bug that's already corrected :)

Yes, I have already installed R14B02, but behaviour is the same.

> Sorry for the big fluffy list of options, but as I said, this is a
> kind of error that is really hard to track down...

Thank you very much for your answer. I hope we resolve this issue. :-)


-- 
Igor Goryachev

From raimo+erlang-bugs@REDACTED  Tue Mar 22 14:04:02 2011
From: raimo+erlang-bugs@REDACTED (Raimo Niskanen)
Date: Tue, 22 Mar 2011 14:04:02 +0100
Subject: Mailing list software change
Message-ID: <20110322130402.GC12691@erix.ericsson.se>

Hi all.

We will change servers and mailing list software from the old-fashioned
ezmlm to GNU Mailman on Thu Mar 24 afternoon (CET). All subscriptions
will be transferred into the corresponding Mailman settings.

The Mailman web interface will probably not be up and running at first;
we'll see about that later.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

From den@REDACTED  Thu Mar 24 14:11:47 2011
From: den@REDACTED (Denis Afonin)
Date: Thu, 24 Mar 2011 16:11:47 +0300
Subject: Orber application don`t depend to mnesia
Message-ID: <20110324161147.1726b6d8@shimbo>

Hi,

In embedded mode the orber application attempts to start before mnesia, so
it crashes.

Erlang version: debian, 1:14.a-dfsg-3.

Regards,
Denis.

PS: Here is the patch:

diff -Naur erlang-14.a-dfsg/lib/orber/src/orber.app.src erlang-14.a-dfsg.1/lib/orber/src/orber.app.src
--- erlang-14.a-dfsg/lib/orber/src/orber.app.src	2011-03-24 13:07:00.221000018 +0000
+++ erlang-14.a-dfsg.1/lib/orber/src/orber.app.src	2011-03-24 13:06:14.045000018 +0000
@@ -101,7 +101,7 @@
 	        orber_iiop_insup, orber_init, orber_reqno,
 	        orber_objkeyserver, orber_iiop_socketsup, 
                 orber_iiop_pm, orber_env]},
-  {applications, [stdlib, kernel]},
+  {applications, [stdlib, kernel, mnesia]},
   {env, []},
   {mod, {orber, []}}
 ]}.

From mcbain@REDACTED  Thu Mar 24 14:25:33 2011
From: mcbain@REDACTED (Carlo Bertoldi)
Date: Thu, 24 Mar 2011 14:25:33 +0100
Subject: Time and system suspend
Message-ID: <AANLkTikcLgpTJZVuhYUdy25=ez3+kKSEoFVm8OLNkwv4@mail.gmail.com>

Hi, I think I've found a bug.
Version I'm using: Erlang R13B03 (erts-5.7.4) [source] [smp:2:2]
[rq:2] [async-threads:0] [hipe] [kernel-poll:false]
on Linux.

Steps to reproduce the problem:
open an Erlang shell,
calendar:now_to_local_time(erlang:now()).          It returns the correct time

Suspend the computer without closing the erlang shell.
Take a nap ;)
Wake up the computer.
calendar:now_to_local_time(erlang:now()).

Now I can tell when I went to sleep, because the time printed is the
time at the moment of the suspension, plus the time
passed since the wake up. Please note that the system clock is fine.
To double check, I quit the erl shell, then fired it up again, and
then the time displayed was correct.
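
To put numbers on the behaviour described above, here is a small sketch of
the arithmetic. It is only a model of what I observed, not of how the
emulator keeps time internally; the function names and the epoch values in
the usage note are made up.

```shell
#!/bin/sh
# Illustration only. All arguments are Unix epoch seconds.
# $1: wall-clock time when the machine was suspended
# $2: wall-clock time when it woke up
# $3: current wall-clock time
# Prints the time the already-running Erlang shell would report:
# the suspend time plus whatever has elapsed since wake-up.
stale_vm_time() {
    echo $(( $1 + ($3 - $2) ))
}

# How far behind the system clock that shell is: the length of the nap.
# $1: suspend time, $2: wake-up time
vm_lag() {
    echo $(( $2 - $1 ))
}
```

For a suspend at 1000, a wake-up at 4600 and a reading taken at 4660,
"stale_vm_time 1000 4600 4660" gives 1060 and "vm_lag 1000 4600" gives
3600, i.e. the shell is exactly one nap behind.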

Regards,
 Carlo Bertoldi

-- 
It is much better to know something about everything than to know everything about one thing.

Blaise Pascal

From hm@REDACTED  Thu Mar 24 17:13:32 2011
From: hm@REDACTED (Håkan Mattsson)
Date: Thu, 24 Mar 2011 17:13:32 +0100
Subject: [erlang-bugs 1] Re: [erlang-bugs] Orber application don`t depend to
	mnesia
In-Reply-To: <20110324161147.1726b6d8@shimbo>
References: <20110324161147.1726b6d8@shimbo>
Message-ID: <AANLkTinfiPMnHER-BRtA_OSi3FBjAD7uagnKm87vEx85@mail.gmail.com>

Orber can make use of Mnesia, but the usage is optional.

/Håkan

On Thu, Mar 24, 2011 at 2:11 PM, Denis Afonin <den@REDACTED> wrote:
> Hi,
>
> In embedded mode the orber application attempts to start before mnesia,
> so it crashes.
>
> Erlang version: debian, 1:14.a-dfsg-3.
>
> Regards,
> Denis.
>
> PS Here the patch:
>
> diff -Naur erlang-14.a-dfsg/lib/orber/src/orber.app.src erlang-14.a-dfsg.1/lib/orber/src/orber.app.src
> --- erlang-14.a-dfsg/lib/orber/src/orber.app.src        2011-03-24 13:07:00.221000018 +0000
> +++ erlang-14.a-dfsg.1/lib/orber/src/orber.app.src      2011-03-24 13:06:14.045000018 +0000
> @@ -101,7 +101,7 @@
>                orber_iiop_insup, orber_init, orber_reqno,
>                orber_objkeyserver, orber_iiop_socketsup,
>                 orber_iiop_pm, orber_env]},
> -  {applications, [stdlib, kernel]},
> +  {applications, [stdlib, kernel, mnesia]},
>   {env, []},
>   {mod, {orber, []}}
>  ]}.
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@REDACTED
>
>


From attila.r.nohl@REDACTED  Thu Mar 24 17:49:19 2011
From: attila.r.nohl@REDACTED (Attila Rajmund Nohl)
Date: Thu, 24 Mar 2011 17:49:19 +0100
Subject: [erlang-bugs 2] Re: [erlang-bugs] Time and system suspend
In-Reply-To: <AANLkTikcLgpTJZVuhYUdy25=ez3+kKSEoFVm8OLNkwv4@mail.gmail.com>
References: <AANLkTikcLgpTJZVuhYUdy25=ez3+kKSEoFVm8OLNkwv4@mail.gmail.com>
Message-ID: <AANLkTinG9Y5rF=Bk__vP8yfmr7jzEs4PpsPFNq+5n3sB@mail.gmail.com>

2011/3/24, Carlo Bertoldi <mcbain@REDACTED>:
> Hi, I think I've found a bug.
> Version I'm using: Erlang R13B03 (erts-5.7.4) [source] [smp:2:2]
> [rq:2] [async-threads:0] [hipe] [kernel-poll:false]
> on Linux.
>
> Steps to reproduce the problem:
> open an Erlang shell,
> calendar:now_to_local_time(erlang:now()).          It returns the correct
> time
>
> Suspend the computer without closing the erlang shell.
> Take a nap ;)
> Wake up the computer.
> calendar:now_to_local_time(erlang:now()).
>
> Now I can tell when I went to sleep, because the time printed is the
> time at the moment of the suspension, plus the time
> passed since the wake up. Please note that the system clock is fine.
> To double check, I quit the erl shell, then fired it up again, and
> then the time displayed was correct.

erlang:now() does not return the current time (despite its
documentation), but a tuple that is guaranteed to continuously
increase for subsequent calls. Use os:timestamp() to get the
current time.
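
A minimal sketch of that suggestion (the module name wall_clock is only
for illustration; os:timestamp/0 returns a {MegaSecs, Secs, MicroSecs}
tuple, the same shape erlang:now/0 returns, so it can be passed straight
to calendar:now_to_local_time/1):

```erlang
-module(wall_clock).
-export([local_time/0]).

%% Read the OS clock on every call instead of going through erlang:now/0.
%% The result is {{Year, Month, Day}, {Hour, Minute, Second}}.
local_time() ->
    calendar:now_to_local_time(os:timestamp()).
```

Calling wall_clock:local_time() before and after a suspend should then
follow the system clock.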


From andrew@REDACTED  Thu Mar 24 18:35:13 2011
From: andrew@REDACTED (Andrew Thompson)
Date: Thu, 24 Mar 2011 13:35:13 -0400
Subject: [erlang-bugs 3] Bug in -spec/@doc ordering in new edoc
Message-ID: <20110324173513.GK20461@hijacked.us>

Hi, I just noticed an odd behaviour where the order in which a @doc and
a -spec appear affects whether the @doc appears in the edoc output. If I
put the -spec first, the only documentation for that function is the
spec, if I put the @doc first, they both show up.

Here's the workaround commit I had to make:

https://github.com/Vagabond/gen_smtp/commit/accfd881e92ae59946444987568217ba4bfa80c4

For some reason, the other two functions exported from that file work
fine with the 'spec before doc' style, just not for this particular
function.

Andrew


From erlangsiri@REDACTED  Fri Mar 25 09:48:51 2011
From: erlangsiri@REDACTED (Siri Hansen)
Date: Fri, 25 Mar 2011 09:48:51 +0100
Subject: [erlang-bugs 4] reltool's app_file option
Message-ID: <AANLkTimspuhOUUqt3CeiTB7Py8wWZ_UcGuegyX+Sr8_u@mail.gmail.com>

This has been corrected and will be included in R14B03
Thanks for the contribution!
Regards
/siri

> The documentation lists 'keep', 'strip' and 'all' as valid values, but
> only 'keep' is allowed. The others give you an exit with "Illegal
> option: {app_file,all}".
>
> The following line in reltool_server.erl needs Val to be both 'strip'
> and 'all' simultaneously:
>
> app_file when Val =:= keep; Val =:= strip, Val =:= all ->
>
> In 0.5.3 (R13B04) and the dev branch.
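
For anyone reading along, a small sketch of why that guard rejects
'strip' and 'all': in a guard, ';' separates alternatives (or) while ','
joins conditions that must all hold (and). The module and function names
below are made up for illustration and are not the reltool_server code:

```erlang
-module(guard_demo).
-export([buggy/1, fixed/1]).

%% Same shape as the reported guard: keep, OR (strip AND all).
%% A value cannot be both strip and all, so only keep is accepted.
buggy(Val) when Val =:= keep; Val =:= strip, Val =:= all -> ok;
buggy(_) -> illegal.

%% Three alternatives separated by ';', so any of the values matches.
fixed(Val) when Val =:= keep; Val =:= strip; Val =:= all -> ok;
fixed(_) -> illegal.
```

With this, guard_demo:buggy(all) falls through to the second clause,
while guard_demo:fixed(all) matches the first.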

From nick@REDACTED  Fri Mar 25 10:51:44 2011
From: nick@REDACTED (Niclas Eklund)
Date: Fri, 25 Mar 2011 10:51:44 +0100
Subject: [erlang-bugs 5] Re: [erlang-bugs] Orber application don`t depend to
	mnesia
In-Reply-To: <20110324161147.1726b6d8@shimbo>
References: <20110324161147.1726b6d8@shimbo>
Message-ID: <Pine.LNX.4.64.1103251046370.11502@mallor.otp.ericsson.se>


Hello!

Thank you for reporting this, but this has already been changed and 
released in the latest version (R14B02/orber-3.6.20).

Best Regards,

Niclas @ Erlang/OTP

On Thu, 24 Mar 2011, Denis Afonin wrote:

> Hi,
>
> In embedded mode the orber application attempts to start before mnesia,
> so it crashes.
>
> Erlang version: debian, 1:14.a-dfsg-3.
>
> Regards,
> Denis.
>
> PS Here the patch:
>
> diff -Naur erlang-14.a-dfsg/lib/orber/src/orber.app.src erlang-14.a-dfsg.1/lib/orber/src/orber.app.src
> --- erlang-14.a-dfsg/lib/orber/src/orber.app.src	2011-03-24 13:07:00.221000018 +0000
> +++ erlang-14.a-dfsg.1/lib/orber/src/orber.app.src	2011-03-24 13:06:14.045000018 +0000
> @@ -101,7 +101,7 @@
> 	        orber_iiop_insup, orber_init, orber_reqno,
> 	        orber_objkeyserver, orber_iiop_socketsup,
>                 orber_iiop_pm, orber_env]},
> -  {applications, [stdlib, kernel]},
> +  {applications, [stdlib, kernel, mnesia]},
>   {env, []},
>   {mod, {orber, []}}
> ]}.
>


From ulf.wiger@REDACTED  Fri Mar 25 11:42:27 2011
From: ulf.wiger@REDACTED (Ulf Wiger)
Date: Fri, 25 Mar 2011 11:42:27 +0100
Subject: [erlang-bugs 6] make release_tests fails if wxWidgets is not
	installed
Message-ID: <DC6E9861-D739-438D-A39E-02F27D24F633@erlang-solutions.com>


When building OTP on a (Mac) without wx installed, make works, but make release_tests fails, not just for wx, but for et and reltool as well.

Putting in SKIP files doesn't help, but renaming the 'test' directory in those apps does.

BR,
Ulf

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com


From fdmanana@REDACTED  Sun Mar 27 19:07:03 2011
From: fdmanana@REDACTED (Filipe David Manana)
Date: Sun, 27 Mar 2011 18:07:03 +0100
Subject: [erlang-bugs 7] possible supervisor bug in r14b02
Message-ID: <AANLkTikL2Ucevd6-dPKdhibmo+KiqUeEbEhv9QqwcJM5@mail.gmail.com>

Hi,

In R14B02 I noticed that for a child with a "temporary" restart_type,
we discard its A component of the MFA tuple when adding the childspec
to the list of the supervisor's children [1].

When the child terminates, its spec is never removed from the list of
the supervisor's children specs.
Then if we call supervisor:restart_child/2 after the child terminates,
the handle_call clause for restart_child gets the childspec with an
MFA  that is {M, F, undefined} [2]. At that point do_start_child will
call apply(M, F, undefined) [3] which will cause the supervisor to
reply with an error, instead of returning {ok, Pid} as in previous
releases. An example for the returned error:

 {error,{'EXIT',{badarg,[{erlang,apply,[gen_server,start_link,undefined]},
                              {supervisor,do_start_child,2},
                              {supervisor,handle_call,3},
                              {gen_server,handle_msg,5},
                              {proc_lib,init_p_do_apply,3}]}}}

The patch at [4] fixes the issue for me. The particular code that is
no longer working in R14B02 but worked on all previous releases, is
from Apache CouchDB, see [5]

This issue was introduced by OTP-9064 (reading from the R14B02 release notes).
Was this intended behaviour? It doesn't make much sense for me to keep
a temporary childspec in the supervisor once the child terminates, so
I believe deleting it from the state is the right thing to do.

[1] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L787
[2] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L314
[3] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L246
[4] - https://github.com/fdmanana/otp/commit/2697042aa9ebab2fcd208c93b7f454b25bc580d4
[5] - https://github.com/apache/couchdb/blob/trunk/src/couchdb/couch_replicator.erl#L119

-- 
Filipe David Manana,
fdmanana@REDACTED, fdmanana@REDACTED

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


From ingela@REDACTED  Mon Mar 28 10:24:59 2011
From: ingela@REDACTED (Ingela Anderton Andin)
Date: Mon, 28 Mar 2011 10:24:59 +0200
Subject: [erlang-bugs 8] Re: possible supervisor bug in r14b02
In-Reply-To: <AANLkTikL2Ucevd6-dPKdhibmo+KiqUeEbEhv9QqwcJM5@mail.gmail.com>
References: <AANLkTikL2Ucevd6-dPKdhibmo+KiqUeEbEhv9QqwcJM5@mail.gmail.com>
Message-ID: <4D9045DB.5060509@erix.ericsson.se>

Hi!

We will take a look at your patch; it sounds like the right thing
to do.
Temporary processes should not be restarted, so you should not have
to save their init-arguments, although until the last release they
were saved, so it was possible to restart them! Especially for
simple_one_for_one supervisors that may have lots of temporary
processes, memory consumption can go sky high if you save them.

Regards  Ingela Erlang OTP team - Ericsson AB

Filipe David Manana wrote:
> Hi,
>
> In R14B02 I noticed that for a child with a "temporary" restart_type,
> we discard its A component of the MFA tuple when adding the childspec
> to the list of the supervisor's children [1].
>
> When the child terminates, its spec is never removed from the list of
> the supervisor's children specs.
> Then if we call supervisor:restart_child/2 after the child terminates,
> the handle_call clause for restart_child gets the childspec with an
> MFA  that is {M, F, undefined} [2]. At that point do_start_child will
> call apply(M, F, undefined) [3] which will cause the supervisor to
> reply with an error, instead of returning {ok, Pid} as in previous
> releases. An example for the returned error:
>
>  {error,{'EXIT',{badarg,[{erlang,apply,[gen_server,start_link,undefined]},
>                               {supervisor,do_start_child,2},
>                               {supervisor,handle_call,3},
>                               {gen_server,handle_msg,5},
>                               {proc_lib,init_p_do_apply,3}]}}}
>
> The patch at [4] fixes the issue for me. The particular code that is
> no longer working in R14B02 but worked on all previous releases, is
> from Apache CouchDB, see [5]
>
> This issue was introduced by OTP-9064 (reading from the R14B02 release notes).
> Was this intended behaviour? It doesn't make much sense for me to keep
> a temporary childspec in the supervisor once the child terminates, so
> I believe deleting it from the state is the right thing to do.
>
> [1] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L787
> [2] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L314
> [3] - https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L246
> [4] - https://github.com/fdmanana/otp/commit/2697042aa9ebab2fcd208c93b7f454b25bc580d4
> [5] - https://github.com/apache/couchdb/blob/trunk/src/couchdb/couch_replicator.erl#L119
>
>   


From fdmanana@REDACTED  Mon Mar 28 12:16:11 2011
From: fdmanana@REDACTED (Filipe David Manana)
Date: Mon, 28 Mar 2011 11:16:11 +0100
Subject: [erlang-bugs 9] Re: possible supervisor bug in r14b02
In-Reply-To: <4D9045DB.5060509@erix.ericsson.se>
References: <AANLkTikL2Ucevd6-dPKdhibmo+KiqUeEbEhv9QqwcJM5@mail.gmail.com>
 <4D9045DB.5060509@erix.ericsson.se>
Message-ID: <AANLkTimTxmVbjGkfY9TxfSYfBxFpzCKq+R4bsVmsBQQm@mail.gmail.com>

Thanks :)

On Mon, Mar 28, 2011 at 9:24 AM, Ingela Anderton Andin
<ingela@REDACTED> wrote:
> Hi!
>
> We will take a look at your patch it sounds like it is the right thing to
> do.
> Temporary processes should not be restarted so you should not have
> to save their init-arguments, although until the last release they were
> saved
> so it was possible to restart them! Especially for simple_one_for_one
> supervisors
> that may have lots of temporary processes memory consumption can
> go sky high if you save them.
>
> Regards  Ingela Erlang OTP team - Ericsson AB
>
> Filipe David Manana wrote:
>>
>> Hi,
>>
>> In R14B02 I noticed that for a child with a "temporary" restart_type,
>> we discard its A component of the MFA tuple when adding the childspec
>> to the list of the supervisor's children [1].
>>
>> When the child terminates, its spec is never removed from the list of
>> the supervisor's children specs.
>> Then if we call supervisor:restart_child/2 after the child terminates,
>> the handle_call clause for restart_child gets the childspec with an
>> MFA  that is {M, F, undefined} [2]. At that point do_start_child will
>> call apply(M, F, undefined) [3] which will cause the supervisor to
>> reply with an error, instead of returning {ok, Pid} as in previous
>> releases. An example for the returned error:
>>
>>  {error,{'EXIT',{badarg,[{erlang,apply,[gen_server,start_link,undefined]},
>>                               {supervisor,do_start_child,2},
>>                               {supervisor,handle_call,3},
>>                               {gen_server,handle_msg,5},
>>                               {proc_lib,init_p_do_apply,3}]}}}
>>
>> The patch at [4] fixes the issue for me. The particular code that is
>> no longer working in R14B02 but worked on all previous releases, is
>> from Apache CouchDB, see [5]
>>
>> This issue was introduced by OTP-9064 (reading from the R14B02 release
>> notes).
>> Was this intended behaviour? It doesn't make much sense for me to keep
>> a temporary childspec in the supervisor once the child terminates, so
>> I believe deleting it from the state is the right thing to do.
>>
>> [1] -
>> https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L787
>> [2] -
>> https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L314
>> [3] -
>> https://github.com/erlang/otp/blob/dev/lib/stdlib/src/supervisor.erl#L246
>> [4] -
>> https://github.com/fdmanana/otp/commit/2697042aa9ebab2fcd208c93b7f454b25bc580d4
>> [5] -
>> https://github.com/apache/couchdb/blob/trunk/src/couchdb/couch_replicator.erl#L119
>>
>>
>
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://erlang.org/mailman/listinfo/erlang-bugs
>


-- 
Filipe David Manana,
fdmanana@REDACTED, fdmanana@REDACTED

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


From philippu@REDACTED  Mon Mar 28 12:38:57 2011
From: philippu@REDACTED (Philipp Unterbrunner)
Date: Mon, 28 Mar 2011 12:38:57 +0200
Subject: [erlang-bugs 10] Re: [erlang-bugs] Distributed node crashes
 silently when initially receiving a big chunk of messages from another node
In-Reply-To: <4D652468.7000404@inf.ethz.ch>
References: <4D652468.7000404@inf.ethz.ch>
Message-ID: <4D906541.2060506@inf.ethz.ch>

The bug persists in r14b02.

If I find time, I will make a small demo application so that others can
reproduce the bug.

Philipp

On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote:
> Hello,
>
> I have run into a serious and very annoying bug.
>
> Affects (at least); R13B04, R14A, R14B, R14B01
> Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP)
>
> When a newly started distributed node receives a high number of messages from another node, the newly started node crashes silently. Nothing is printed to the console. No crash dump or core dump is produced.
>
> In trying to find a work-around, I found the following curious behavior:
>
> * The bug *only* occurs for distributed nodes (but regardless of whether the nodes run on the same machine).
> * Waiting a few seconds (or even longer) before sending the first message to the newly started node does *not* make a difference. The node will still crash when confronted with a large number of incoming messages later.
> * Speed matters. When doing a debug build, the bug appears less often than when doing a release build, especially when HiPE is enabled. However, I managed to cause the bug even in debug mode, and when OTP was not compiled with native libs. The bug is simply much less likely to be observed.
> * The number of messages sent *initially* matters most. Slowly "ramping up" the load is a work-around. Once a node is working at high throughput, it is OK to stop sending messages for an arbitrary period and at a later point send a big chunk of messages that would have killed the node if sent initially.
> * Timing matters. Running the receiver node with +T 7 or higher makes the problem disappear.
> * Setting the sender node's distribution buffer size to the minimum (+zdbbl 1) makes the problem appear less often.
>
> I have reproduced the bug in various applications. The behavior described above also makes it fairly obvious that the application is not at fault.
>
> Rather, it appears that the receiver node is unable to buffer incoming messages and crashes. Of particular interest here is the fact that "ramping up" the load is a work-around. I suspect a low-level race condition where the receiver node does not allocate sufficient buffer space in time and crashes.
>
> Given that the existing work-arounds are not desirable ("ramp up" requires changes to the application code, +T 7 and +zdbbl 1 decrease performance), and given that the bug now persists over multiple releases, I hope someone can soon look into it.
>
> Thank you,
>
> Philipp

From emile@REDACTED  Tue Mar 29 13:46:19 2011
From: emile@REDACTED (Emile Joubert)
Date: Tue, 29 Mar 2011 12:46:19 +0100
Subject: [erlang-bugs] crypto from windows service?
Message-ID: <4D91C68B.2070702@rabbitmq.com>

Hi,

I'm unable to start the crypto module in an Erlang VM installed as a 
Windows service, if that service has any stopaction or a debugtype 
specified.

Here are the steps to reproduce:

 > erlsrv add test -st halt() -sn test@REDACTED
 > erlsrv start test
 > werl.exe -remsh test@REDACTED -sname tmp
   (in the werl window)
1> crypto:start().

When left long enough this leads to

** Node 'test@REDACTED' not responding **
** Removing (timedout) connection **

Specifying a debugtype of new or reuse also leads to a timeout:

 > erlsrv add test -de new -sn test@REDACTED
 > erlsrv start test
 > werl.exe -remsh test@REDACTED -sname tmp
   (in the werl window)
1> crypto:start().

When the service is installed without a stopaction and without a 
debugtype specified then the crypto module works fine:

 > erlsrv add test -sn test
 > erlsrv start test
 > werl.exe -remsh test@REDACTED -sname tmp
   (in the werl window)
1> crypto:rand_bytes(1).
<<"?">>

I've observed this behaviour on version R14B01 and R14B02 on Windows XP 
32bit. Is this a known issue and is there a better workaround than not 
specifying stopaction or a debugtype ?


Regards

Emile


From kruber@REDACTED  Tue Mar 29 15:06:46 2011
From: kruber@REDACTED (Nico Kruber)
Date: Tue, 29 Mar 2011 15:06:46 +0200
Subject: [erlang-bugs] UTF8 string handling in different erlang:*** functions
Message-ID: <201103291506.57110.kruber@zib.de>

Is it possible that UTF8 strings are not supported by both
erlang:md5/1 and
erlang:list_to_binary/1 (and possibly more)?

I'm getting a bad argument exception when running the following:

> erlang:md5("Wàgrain (Wågrŏã)").
** exception error: bad argument
     in function  erlang:md5/1
        called as 
erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,
                              41])

even simpler, one can call:
> erlang:md5([256]).
** exception error: bad argument
     in function  erlang:md5/1
        called as erlang:md5([256])


For characters larger than 255, this exception is thrown. The same goes
for erlang:list_to_binary/1.

Both state that the input should be an iodata() or iolist() which are defined 
as:

iodata() = iolist() | binary()
iolist() = [char() | binary() | iolist()]
%  a binary is allowed as the tail of the list

And according to
http://www.erlang.org/doc/reference_manual/typespec.html
a character is any valid integer between 0 and 16#10ffff and it should be this 
way since erlang strings are unicode strings.

If this is correct behaviour, then how do I hash a unicode string without 
using erlang:term_to_binary/1 (which is possibly costly and should be 
unnecessary).


Regards
Nico

From bob@REDACTED  Tue Mar 29 15:17:03 2011
From: bob@REDACTED (Bob Ippolito)
Date: Tue, 29 Mar 2011 09:17:03 -0400
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:***
	functions
In-Reply-To: <201103291506.57110.kruber@zib.de>
References: <201103291506.57110.kruber@zib.de>
Message-ID: <AANLkTik22seCeSn2kUj+NZBSzmteMrN-D+-o3z2QSx2A@mail.gmail.com>

On Tue, Mar 29, 2011 at 9:06 AM, Nico Kruber <kruber@REDACTED> wrote:
> is it possible that UTF8 strings are not supported by both
> erlang:md5/1 and
> erlang:list_to_binary/1 (and possibly more?)
>
> I'm getting a bad argument exception when running the following:
>
>> erlang:md5("Wàgrain (Wågrŏã)").
> ** exception error: bad argument
>     in function  erlang:md5/1
>        called as
> erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,
>                              41])
>
> even simpler, one can call:
>> erlang:md5([256]).
> ** exception error: bad argument
>     in function  erlang:md5/1
>        called as erlang:md5([256])
>
>
> for characters larger than 255, this exception is thrown. same for
> erlang:list_to_binary/1.
>
> Both state that the input should be an iodata() or iolist() which are defined
> as:
>
> iodata() = iolist() | binary()
> iolist() = [char() | binary() | iolist()]
> %  a binary is allowed as the tail of the list
>
> And according to
> http://www.erlang.org/doc/reference_manual/typespec.html
> a character is any valid integer between 0 and 16#10ffff and it should be this
> way since erlang strings are unicode strings.
>
> If this is correct behaviour, then how do I hash a unicode string without
> using erlang:term_to_binary/1 (which is possibly costly and should be
> unnecessary).

What you have is not UTF8, because UTF8 is defined over bytes
(0..255). IIRC, the actual definition of iolist should be
maybe_improper_list(byte() | binary() | iolist(), binary()). Functions
like erlang:list_to_binary/1 and erlang:md5/1 also only make sense
over bytes.

You can convert a list of unicode code points (L) to UTF8 with
unicode:characters_to_binary(L, utf8).
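
As a sketch of how that could look for hashing (module and function
names are just for illustration, and raising an error in the failure
branch is an assumption about how you would want bad input reported):

```erlang
-module(md5_unicode).
-export([hash/1]).

%% Encode a list of Unicode code points (or other chardata) as UTF-8
%% bytes, then hash the bytes with erlang:md5/1.
hash(Chars) ->
    case unicode:characters_to_binary(Chars, unicode, utf8) of
        Bin when is_binary(Bin) ->
            erlang:md5(Bin);
        Other ->
            %% {error, _, _} or {incomplete, _, _} for invalid input
            erlang:error({bad_unicode, Other})
    end.
```

For pure ASCII input the UTF-8 bytes are the same as the list elements,
so md5_unicode:hash("abc") gives the same digest as erlang:md5("abc").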

-bob


From pan@REDACTED  Tue Mar 29 15:26:11 2011
From: pan@REDACTED (pan@REDACTED)
Date: Tue, 29 Mar 2011 15:26:11 +0200
Subject: [erlang-bugs] Re: [erlang-bugs 10] Re: Distributed node crashes
 silently when initially receiving a big chunk of messages from another node
In-Reply-To: <4D906541.2060506@inf.ethz.ch>
References: <4D652468.7000404@inf.ethz.ch> <4D906541.2060506@inf.ethz.ch>
Message-ID: <Pine.LNX.4.64.1103291522070.10609@arwen.otp.ericsson.se>

Hi!

This sounds really bad! A demo application that reproduces the bug would 
be really nice.

Have you tried to enable core dumps to see if the erlang node crashes with 
a segfault? I suppose there are no erl_crash.dump files left after the 
crash that I can look at either?

Any way to reproduce it would make it easier to find!

Cheers,
/Patrik

On Mon, 28 Mar 2011, Philipp Unterbrunner wrote:

> The bug persists in r14b02.
>
> If I find time, I will make a small demo application so that others can
> reproduce the bug.
>
> Philipp
>
> On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote:
>> Hello,
>>
>> I have run into a serious and very annoying bug.
>>
>> Affects (at least); R13B04, R14A, R14B, R14B01
>> Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP)
>>
>> When a newly started distributed node receives a high number of messages from another node, the newly started node crashes silently. Nothing is printed to the console. No crash dump or core dump is produced.
>>
>> In trying to find a work-around, I found the following curious behavior:
>>
>> * The bug *only* occurs for distributed nodes (but regardless of whether the nodes run on the same machine).
>> * Waiting a few seconds (or even longer) before sending the first message to the newly started node does *not* make a difference. The node will still crash when confronted with a large number of incoming messages later.
>> * Speed matters. When doing a debug build, the bug appears less often than when doing a release build, especially when HiPE is enabled. However, I managed to cause the bug even in debug mode, and when OTP was not compiled with native libs. The bug is simply much less likely to be observed.
>> * The number of messages sent *initially* matters most. Slowly "ramping up" the load is a work-around. Once a node is working at high throughput, it is OK to stop sending messages for an arbitrary period and at a later point send a big chunk of messages that would have killed the node if sent initially.
>> * Timing matters. Running the receiver node with +T 7 or higher makes the problem disappear.
>> * Setting the sender node's distribution buffer size to the minimum (+zdbbl 1) makes the problem appear less often.
>>
>> I have reproduced the bug in various applications. The behavior described above also makes it fairly obvious that the application is not at fault.
>>
>> Rather, it appears that the receiver node is unable to buffer incoming messages and crashes. Of particular interest here is the fact that "ramping up" the load is a work-around. I suspect a low-level race condition where the receiver node does not allocate sufficient buffer space in time and crashes.
>>
>> Given that the existing work-arounds are not desirable ("ramp up" requires changes to the application code, +T 7 and +zdbbl 1 decrease performance), and given that the bug now persists over multiple releases, I hope someone can soon look into it.
>>
>> Thank you,
>>
>> Philipp
>


From pan@REDACTED  Tue Mar 29 16:08:26 2011
From: pan@REDACTED (pan@REDACTED)
Date: Tue, 29 Mar 2011 16:08:26 +0200
Subject: [erlang-bugs] Re: crypto from windows service?
In-Reply-To: <4D91C68B.2070702@rabbitmq.com>
References: <4D91C68B.2070702@rabbitmq.com>
Message-ID: <Pine.LNX.4.64.1103291604540.10609@arwen.otp.ericsson.se>

Hi!

I am unable to reproduce the problem, but a wild guess would be that the 
openssl libraries (dll's) get messed up in some way by the small 
differences in process creation when you connect the stdout/stdin to a 
pipe. Have you tried updating openssl on the machine? What happens if you 
specify debugtype console? Does anything show up in the debug log or in 
the event viewer when the node crashes? Does the node really crash or is 
it only the connection that fails?

Cheers,
/Patrik

On Tue, 29 Mar 2011, Emile Joubert wrote:

> Hi,
>
> I'm unable to start the crypto module in an Erlang VM installed as a Windows 
> service, if that service has any stopaction or a debugtype specified.
>
> Here are the steps to reproduce:
>
>> erlsrv add test -st halt() -sn test@REDACTED
>> erlsrv start test
>> werl.exe -remsh test@REDACTED -sname tmp
>  (in the werl window)
> 1> crypto:start().
>
> When left long enough this leads to
>
> ** Node 'test@REDACTED' not responding **
> ** Removing (timedout) connection **
>
> Specifying a debugtype of new or reuse also leads to a timeout:
>
>> erlsrv add test -de new -sn test@REDACTED
>> erlsrv start test
>> werl.exe -remsh test@REDACTED -sname tmp
>  (in the werl window)
> 1> crypto:start().
>
> When the service is installed without a stopaction and without a debugtype 
> specified then the crypto module works fine:
>
>> erlsrv add test -sn test
>> erlsrv start test
>> werl.exe -remsh test@REDACTED -sname tmp
>  (in the werl window)
> 1> crypto:rand_bytes(1).
> <<"?">>
>
> I've observed this behaviour on version R14B01 and R14B02 on Windows XP 
> 32bit. Is this a known issue and is there a better workaround than not 
> specifying stopaction or a debugtype ?
>
>
> Regards
>
> Emile
>
>

From kruber@REDACTED  Tue Mar 29 16:10:28 2011
From: kruber@REDACTED (Nico Kruber)
Date: Tue, 29 Mar 2011 16:10:28 +0200
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:***
	functions
In-Reply-To: <AANLkTik22seCeSn2kUj+NZBSzmteMrN-D+-o3z2QSx2A@mail.gmail.com>
References: <201103291506.57110.kruber@zib.de>
 <AANLkTik22seCeSn2kUj+NZBSzmteMrN-D+-o3z2QSx2A@mail.gmail.com>
Message-ID: <201103291610.35983.kruber@zib.de>

On Tuesday 29 March 2011 15:17:03 Bob Ippolito wrote:
> On Tue, Mar 29, 2011 at 9:06 AM, Nico Kruber <kruber@REDACTED> wrote:
> > is it possible that UTF8 strings are not supported by both
> > erlang:md5/1 and
> > erlang:list_to_binary/1 (and possibly more?)
> > 
> > I'm getting a bad argument exception when running the following:
> >> erlang:md5("Wàgrain (Wågrŏã)").
> > 
> > ** exception error: bad argument
> >     in function  erlang:md5/1
> >        called as
> > erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,
> >                              41])
> > 
> > even simpler, one can call:
> >> erlang:md5([256]).
> > 
> > ** exception error: bad argument
> >     in function  erlang:md5/1
> >        called as erlang:md5([256])
> > 
> > 
> > for characters larger than 255, this exception is thrown. same for
> > erlang:list_to_binary/1.
> > 
> > Both state that the input should be an iodata() or iolist() which are
> > defined as:
> > 
> > iodata() = iolist() | binary()
> > iolist() = [char() | binary() | iolist()]
> > %  a binary is allowed as the tail of the list
> > 
> > And according to
> > http://www.erlang.org/doc/reference_manual/typespec.html
> > a character is any valid integer between 0 and 16#10ffff and it should be
> > this way since erlang strings are unicode strings.
> > 
> > If this is correct behaviour, then how do I hash a unicode string without
> > using erlang:term_to_binary/1 (which is possibly costly and should be
> > unnecessary).
> 
> What you have is not UTF8, because UTF8 is defined over bytes
> (0..255).

oh, right - this was maybe misleading, I should have rather said "erlang 
string"

> IIRC, the actual definition of iolist should be
> maybe_improper_list(byte() | binary() | iolist(), binary()). Functions
> like erlang:list_to_binary/1 and erlang:md5/1 also only make sense
> over bytes.

ok, makes sense, although it is rather inconvenient not being able to hash 
strings :(

> You can convert a list of unicode code points (L) to UTF8 with
> unicode:characters_to_binary(L, utf8).

ok, thanks for the tip - FYI, I ran a simple benchmark executing
unicode:characters_to_binary/1 and erlang:term_to_binary/1 a million times
with the same string, which resulted in the following:

> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s: 33944331.2966734541/s
> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s: 1498084.69871269591/s

-> looks like I should choose erlang:term_to_binary/1 since at least on my
machine it is around twice as fast.

Nico

From bob@REDACTED  Tue Mar 29 16:29:38 2011
From: bob@REDACTED (Bob Ippolito)
Date: Tue, 29 Mar 2011 10:29:38 -0400
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:***
	functions
In-Reply-To: <201103291610.35983.kruber@zib.de>
References: <201103291506.57110.kruber@zib.de>
 <AANLkTik22seCeSn2kUj+NZBSzmteMrN-D+-o3z2QSx2A@mail.gmail.com>
 <201103291610.35983.kruber@zib.de>
Message-ID: <AANLkTikeJXM6tdnyx49_9AOLzAsN+05NorgCfmZCJoiu@mail.gmail.com>

On Tue, Mar 29, 2011 at 10:10 AM, Nico Kruber <kruber@REDACTED> wrote:
> On Tuesday 29 March 2011 15:17:03 Bob Ippolito wrote:
>> On Tue, Mar 29, 2011 at 9:06 AM, Nico Kruber <kruber@REDACTED> wrote:
>> > is it possible that UTF8 strings are not supported by both
>> > erlang:md5/1 and
>> > erlang:list_to_binary/1 (and possibly more?)
>> >
>> > I'm getting a bad argument exception when running the following:
>> >> erlang:md5("Wàgrain (Wågrŏã)").
>> >
>> > ** exception error: bad argument
>> >     in function  erlang:md5/1
>> >        called as
>> > erlang:md5([87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,
>> >                              41])
>> >
>> > even simpler, one can call:
>> >> erlang:md5([256]).
>> >
>> > ** exception error: bad argument
>> >     in function  erlang:md5/1
>> >        called as erlang:md5([256])
>> >
>> >
>> > for characters larger than 255, this exception is thrown. same for
>> > erlang:list_to_binary/1.
>> >
>> > Both state that the input should be an iodata() or iolist() which are
>> > defined as:
>> >
>> > iodata() = iolist() | binary()
>> > iolist() = [char() | binary() | iolist()]
>> > %  a binary is allowed as the tail of the list
>> >
>> > And according to
>> > http://www.erlang.org/doc/reference_manual/typespec.html
>> > a character is any valid integer between 0 and 16#10ffff and it should be
>> > this way since erlang strings are unicode strings.
>> >
>> > If this is correct behaviour, then how do I hash a unicode string without
>> > using erlang:term_to_binary/1 (which is possibly costly and should be
>> > unnecessary).
>>
>> What you have is not UTF8, because UTF8 is defined over bytes
>> (0..255).
>
> oh, right - this was maybe misleading, I should have rather said "erlang
> string"
>
>> IIRC, the actual definition of iolist should be
>> maybe_improper_list(byte() | binary() | iolist(), binary()). Functions
>> like erlang:list_to_binary/1 and erlang:md5/1 also only make sense
>> over bytes.
>
> ok, makes sense, although it is rather inconvenient not being able to hash
> strings :(

The real lesson here is "do not use erlang strings". Binaries in UTF8
are better for most use cases that I've come across in the past few
years. A bit uglier in the source, but the memory and performance
benefits make it worthwhile.
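
A minimal sketch of that approach, assuming the input is a list of code
points (or any chardata): encode to UTF-8 first, then hash the bytes. The
module name hash_utf8 and the function digest/1 are made up for this
example; only unicode:characters_to_binary/3 and erlang:md5/1 are real
library calls.

```erlang
-module(hash_utf8).
-export([digest/1]).

%% Hash chardata by encoding it as UTF-8 first.
%% Returns {ok, Digest} or {error, Reason} when the input cannot be encoded.
digest(Chars) ->
    case unicode:characters_to_binary(Chars, unicode, utf8) of
        Bin when is_binary(Bin) ->
            %% Plain bytes now, so erlang:md5/1 accepts the input.
            {ok, erlang:md5(Bin)};
        {error, _Encoded, Rest} ->
            %% Rest starts at the first invalid code point.
            {error, {invalid_character, Rest}};
        {incomplete, _Encoded, Rest} ->
            %% Input ended in the middle of a multi-byte sequence.
            {error, {incomplete_input, Rest}}
    end.
```

Calling hash_utf8:digest/1 with the list from the original report should
give {ok, Digest} with a 16-byte Digest instead of a badarg exception.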

>> You can convert a list of unicode code points (L) to UTF8 with
>> unicode:characters_to_binary(L, utf8).
>
> ok, thanks for the tip - FYI, I ran a simple benchmark executing
> unicode:characters_to_binary/1 and erlang:term_to_binary/1 a Million times
> with the same string which resulted in the following:
>
>> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s:
> 33944331.2966734541/s
>> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s:
> 1498084.69871269591/s
>
> -> looks like I should choose erlang:term_to_binary/1 since at least on my
> machine it is around twice as fast.

I guess it depends on if you care what the result is... these
operations are completely different, and there's not even any
guarantee that erlang:term_to_binary/1 is always going to return the
same output for a given input... there is more than one possible
representation for a string in external term format, and the spec does
not guarantee that the implementation will do it any particular way.
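
To illustrate the "more than one representation" point, here is a small
sketch that extracts the type tag byte from a term's external format. The
module ext_tag is hypothetical, and the tag values in the comments are
observed behaviour taken from the external term format description, not
something an application should depend on.

```erlang
-module(ext_tag).
-export([tag/1]).

%% Return the type tag of the external representation of Term.
%% 131 is the version byte that term_to_binary/1 emits first.
tag(Term) ->
    <<131, Tag, _/binary>> = term_to_binary(Term),
    Tag.

%% ext_tag:tag("abc")  yields 107 (STRING_EXT, a compact list of bytes)
%% ext_tag:tag([335])  yields 108 (LIST_EXT, a general list of terms)
```

So the same "string" type can be laid out in at least two different ways
depending on its contents, which is one more reason not to feed
term_to_binary/1 output into a hash that has to be stable.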

-bob


From psa@REDACTED  Tue Mar 29 16:34:22 2011
From: psa@REDACTED (Paulo Sérgio Almeida)
Date: Tue, 29 Mar 2011 15:34:22 +0100
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:***
 functions
In-Reply-To: <201103291610.35983.kruber@zib.de>
References: <201103291506.57110.kruber@zib.de>
 <AANLkTik22seCeSn2kUj+NZBSzmteMrN-D+-o3z2QSx2A@mail.gmail.com>
 <201103291610.35983.kruber@zib.de>
Message-ID: <4D91EDEE.1070300@di.uminho.pt>

On 3/29/11 3:10 PM, Nico Kruber wrote:
>> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s:
> 33944331.2966734541/s
>> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s:
> 1498084.69871269591/s
>
> ->  looks like I should choose erlang:term_to_binary/1 since at least on my
> machine it is around twice as fast.
It's not twice but 20 times as fast. Amazing. Even though
unicode:characters_to_binary/1 should be slower, a difference this large is
surprising.

Regards,
Paulo


From kostis@REDACTED  Tue Mar 29 17:34:29 2011
From: kostis@REDACTED (Kostis Sagonas)
Date: Tue, 29 Mar 2011 18:34:29 +0300
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:***
 functions
In-Reply-To: <4D91EDEE.1070300@di.uminho.pt>
References: <201103291506.57110.kruber@zib.de>
 <AANLkTik22seCeSn2kUj+NZBSzmteMrN-D+-o3z2QSx2A@mail.gmail.com>
 <201103291610.35983.kruber@zib.de> <4D91EDEE.1070300@di.uminho.pt>
Message-ID: <4D91FC05.4090802@cs.ntua.gr>

Paulo Sérgio Almeida wrote:
> On 3/29/11 3:10 PM, Nico Kruber wrote:
>>> 1000000 iterations of "erlang:term_to_binary/1" took 0.02946s:
>> 33944331.2966734541/s
>>> 1000000 iterations of "unicode:characters_to_binary/1" took 0.667519s:
>> 1498084.69871269591/s
>>
>> ->  looks like I should choose erlang:term_to_binary/1 since at least
>> on my machine it is around twice as fast.
> It's not twice but 20 times as fast. Amazing. Even though
> unicode:characters_to_binary/1 should be slower, a difference this large is
> surprising.

I have trouble reproducing these numbers, both the 2 and the 20.
With the program at the end of this mail, on an x86_64, I get:

Eshell V5.8.3  (abort with ^G)
1> c(t).
{ok,t}
2> timer:tc(t, t2b, [1000000]).
{133505,ok}
3> timer:tc(t, c2b, [1000000]).
{636624,ok}

which makes the term_to_binary version about 4 times as fast on this 
machine.  On a 32-bit machine the difference is about 6 - 6.5 times.

Kostis

%%==============================================================
-module(t).

-export([t2b/1, c2b/1]).

-define(S, "some medium sized string here").

t2b(N) ->
   lists:foreach(fun (_) -> erlang:term_to_binary(?S) end, lists:seq(1,N)).

c2b(N) ->
   lists:foreach(fun (_) -> unicode:characters_to_binary(?S) end,
                 lists:seq(1,N)).


From kruber@REDACTED  Tue Mar 29 17:57:06 2011
From: kruber@REDACTED (Nico Kruber)
Date: Tue, 29 Mar 2011 17:57:06 +0200
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:***
	functions
In-Reply-To: <4D91FC05.4090802@cs.ntua.gr>
References: <201103291506.57110.kruber@zib.de> <4D91EDEE.1070300@di.uminho.pt>
 <4D91FC05.4090802@cs.ntua.gr>
Message-ID: <201103291757.06312.kruber@zib.de>

  On Tuesday 29 March 2011 17:34:29 you wrote:
> Paulo Sérgio Almeida wrote:
> > On 3/29/11 3:10 PM, Nico Kruber wrote:

> > 
> > It's not twice but 20 times as fast. Amazing. Even though
> > unicode:characters_to_binary/1 should be slower, a difference this large
> > is surprising.
> 
> I have trouble reproducing these numbers, both the 2 and the 20.
> With the program at the end of this mail, on an x86_64, I get:
> 
> Eshell V5.8.3  (abort with ^G)
> 1> c(t).
> {ok,t}
> 2> timer:tc(t, t2b, [1000000]).
> {133505,ok}
> 3> timer:tc(t, c2b, [1000000]).
> {636624,ok}
> 
> which makes the term_to_binary version about 4 times as fast on this
> machine.  On a 32-bit machine the difference is about 6 - 6.5 times.
> 
> Kostis
> 
> %%==============================================================
> -module(t).
> 
> -export([t2b/1, c2b/1]).
> 
> -define(S, "some medium sized string here").
> 
> t2b(N) ->
>    lists:foreach(fun (_) -> erlang:term_to_binary(?S) end, lists:seq(1,N)).
> 
> c2b(N) ->
>    lists:foreach(fun (_) -> unicode:characters_to_binary(?S) end,
>                  lists:seq(1,N)).

The lists:seq(1,1000000) call will additionally slow things down, as it
builds the whole list first.
-> I used the following loop for my benchmark:
%%==============================================================
-spec iter(Count::pos_integer(), F::fun(() -> any()), Tag::string()) -> ok.
iter(Count, F, Tag) ->
    F(),
    Start = erlang:now(),
    iter_inner(Count, F),
    Stop = erlang:now(),
    ElapsedTime = timer:now_diff(Stop, Start) / 1000000.0,
    Frequency = Count / ElapsedTime,
    ct:pal("~p iterations of ~p took ~ps: ~p1/s~n",
           [Count, Tag, ElapsedTime, Frequency]),
    ok.

-spec iter_inner(Count::pos_integer(), F::fun(() -> any())) -> ok.
iter_inner(0, _) ->
    ok;
iter_inner(N, F) ->
    F(),
    iter_inner(N - 1, F).
%%==============================================================

regarding 2 vs 20: I simply misread the numbers :(

Nico

From emile@REDACTED  Tue Mar 29 19:05:53 2011
From: emile@REDACTED (Emile Joubert)
Date: Tue, 29 Mar 2011 18:05:53 +0100
Subject: [erlang-bugs] Re: crypto from windows service?
In-Reply-To: <Pine.LNX.4.64.1103291604540.10609@arwen.otp.ericsson.se>
References: <4D91C68B.2070702@rabbitmq.com>
 <Pine.LNX.4.64.1103291604540.10609@arwen.otp.ericsson.se>
Message-ID: <4D921171.3010707@rabbitmq.com>

Hi Patrik,

On 29/03/11 15:08, pan@REDACTED wrote:
> Hi!
>
> I am unable to reproduce the problem, but a wild guess would be that the
> openssl libraries (dll's) get messed up in some way by the small
> differences in process creation when you connect the stdout/stdin to a
> pipe. Have you tried updating openssl on the machine? What happens if

I've been able to reproduce the same problem on a separate new 32bit 
Windows XP SP3 install with R14B02 and Win32 OpenSSL v0.9.8r Light.

> you specify debugtype console? Does anything show up in the debug log or

Setting -debugtype to console pops up a console when the service starts 
(which is inconvenient for a service under normal circumstances). 
Starting the crypto app from the popped up console or from a remotely 
attached node does work however. The problem in my report only occurs if 
debugtype is set to new or reuse, or if any stopaction is specified.

> in the event viewer when the node crashes? Does the node really crash or
> is it only the connection that fails?

Nothing useful appears in the OS event viewer after a crash. The node 
really crashes, not just the connection. It is impossible to establish 
new connections.


Regards

Emile


From kruber@REDACTED  Wed Mar 30 12:04:58 2011
From: kruber@REDACTED (Nico Kruber)
Date: Wed, 30 Mar 2011 12:04:58 +0200
Subject: [erlang-bugs] Re: UTF8 string handling in different erlang:***
	functions
In-Reply-To: <Pine.LNX.4.64.1103301132500.10609@arwen.otp.ericsson.se>
References: <201103291506.57110.kruber@zib.de>
 <201103291757.06312.kruber@zib.de>
 <Pine.LNX.4.64.1103301132500.10609@arwen.otp.ericsson.se>
Message-ID: <201103301205.02403.kruber@zib.de>

On Wednesday 30 March 2011 11:47:28 Patrik Nyblom wrote:
> Hi!
> 
> To properly measure this, one has to bear in mind that
> erlang:term_to_binary(<constant expression>) gets evaluated at compile
> time, while unicode:characters_to_binary(<constant expression>) does not.

that's what I was thinking, too, but I haven't had time to work around it yet

> Using this program:
> -------------------
> t2bfun() ->
>      fun(X) -> erlang:term_to_binary(X) end.
> c2bfun() ->
>      fun(X) -> unicode:characters_to_binary(X,unicode) end.
> 
> iter(Count, F, String, Tag) ->
>      {_,Red0} = erlang:process_info(self(),reductions),
>      F(String),
>      {_,Red1} =  erlang:process_info(self(),reductions),
>      io:format("Reductions for one call: ~w~n",[Red1 - Red0]),
>      Start = erlang:now(),
>      iter_inner(Count, F, String),
>      Stop = erlang:now(),
>      ElapsedTime = timer:now_diff(Stop, Start) / 1000000.0,
>      Frequency = Count / ElapsedTime,
>      ct:pal("~p iterations of ~p took ~ps: ~p1/s~n",
>             [Count, Tag, ElapsedTime, Frequency]),
>      ok.
> 
> iter_inner(0, _,_) ->
>      ok;
> iter_inner(N, F, String) ->
>      F(String),
>      iter_inner(N - 1, F, String).
> ------------------
> doing:
> ------------------
> 26>
> StringWUnicode="jkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saa
> dakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saad
> akfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saada
> kfd??sakfd??s".
> "jkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??
> sjkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??s
> jkl?fdsakf??dskfd??saadakfd??sakfd??sjkl?fdsakf??dskfd??saadakfd??sakfd??s"
> 27> t:iter(1000000,t:c2bfun(),StringWUnicode,c2b).
> Reductions for one call: 30
> ----------------------------------------------------
> 2011-03-30 11:36:25.808
> 1000000 iterations of c2b took 3.280548s: 304827.120346966431/s
> 
> 
> ok
> 28> t:iter(1000000,t:t2bfun(),StringWUnicode,t2b).
> Reductions for one call: 4
> ----------------------------------------------------
> 2011-03-30 11:36:38.837
> 1000000 iterations of t2b took 1.72605s: 579357.49254077231/s
> 
> 
> ok
> 29> KostisString="some medium sized string here".
> "some medium sized string here"
> 30> t:iter(1000000,t:c2bfun(),KostisString,c2b).
> Reductions for one call: 6
> ----------------------------------------------------
> 2011-03-30 11:37:34.952
> 1000000 iterations of c2b took 0.543842s: 1838769.34845046891/s
> 
> 
> ok
> 31> t:iter(1000000,t:t2bfun(),KostisString,t2b).
> Reductions for one call: 4
> ----------------------------------------------------
> 2011-03-30 11:37:41.658
> 1000000 iterations of t2b took 0.362457s: 2758947.9579646691/s
> 
> 
> ok
> -----------------------
> - You get more correct measurements, showing a 2 to 3 speedup using
> term_to_binary.

-----------------------
using these tests, I get a similar result of around 2 speedup:
5> String2 = "qwertzuiopasdfghjklyxcvbnm" ++ 
[246,252,228,87,224,103,114,97,105,110,32,40,87,229,103,114,335,227,41].
[113,119,101,114,116,122,117,105,111,112,97,115,100,102,103,
 104,106,107,108,121,120,99,118,98,110,109,246,252,228|...]
6>  t:iter(1000000,t:c2bfun(),String2,c2b).                                                                          
Reductions for one call: 8
----------------------------------------------------
2011-03-30 11:56:05.669
1000000 iterations of c2b took 0.701959s: 1424584.6267374591/s


ok
7> 
7>  t:iter(1000000,t:t2bfun(),String2,c2b). 
Reductions for one call: 4
----------------------------------------------------
2011-03-30 11:56:14.630
1000000 iterations of c2b took 1.296981s: 771021.31796842061/s


ok
-----------------------

(I had to add a character larger than 255 manually as öäü are all below 256
(246, 228, 252) - at least on my platform)
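
A small sketch of why that matters (latin1_vs_utf8 is a made-up module
name): for code points between 128 and 255, list_to_binary/1 and
unicode:characters_to_binary/1 both succeed but produce different bytes,
and only above 255 does list_to_binary/1 fail with badarg.

```erlang
-module(latin1_vs_utf8).
-export([bytes/1]).

%% Return both byte encodings of a list of code points: the one-byte-per-
%% character form (or the atom badarg if a code point exceeds 255) and UTF-8.
bytes(Chars) ->
    Latin1 = try list_to_binary(Chars)
             catch error:badarg -> badarg
             end,
    {Latin1, unicode:characters_to_binary(Chars)}.

%% latin1_vs_utf8:bytes([246,228,252]) gives
%%   {<<246,228,252>>, <<195,182,195,164,195,188>>}
%% latin1_vs_utf8:bytes([335]) gives
%%   {badarg, <<197,143>>}
```

Because the two encodings differ for 128..255, a hash of one will not
match a hash of the other, so it is worth picking one conversion and
using it everywhere.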

> The reasons are many:
> 1) unicode:characters_to_binary is a well behaved bif consuming
> reductions, which also means that it has to be more elaborate when
> allocating, because it may be interrupted. This is more of a problem in
> the ancient erlang:term_to_binary bif than one in the unicode bif.
> 2) unicode:characters_to_binary does more elaborate range checking, it
> only allows *valid* unicode characters, as described in the standard.
> 3) unicode:characters_to_binary may need some optimization, but using
> gprof, I find no really low hanging fruit.

> They are both bleading fast, so unless you plan to do huge amounts of md5
> calculations, my humble opinion is that you should use the one that suits
> your problem.

no, I'm perfectly fine with unicode:characters_to_binary (if speedup is only 
at 2)

Nico

From philippu@REDACTED  Wed Mar 30 12:26:54 2011
From: philippu@REDACTED (Philipp Unterbrunner)
Date: Wed, 30 Mar 2011 12:26:54 +0200
Subject: [erlang-bugs] Re: [erlang-bugs 10] Re: Distributed node crashes
 silently when initially receiving a big chunk of messages from another node
In-Reply-To: <Pine.LNX.4.64.1103291522070.10609@arwen.otp.ericsson.se>
References: <4D652468.7000404@inf.ethz.ch> <4D906541.2060506@inf.ethz.ch>
 <Pine.LNX.4.64.1103291522070.10609@arwen.otp.ericsson.se>
Message-ID: <4D93056E.2050608@inf.ethz.ch>

I do not have a reasonably small demo yet, but I managed to get some
coredumps of beam.smp. The nodes crash with a segfault at
hipe_mode_switch.c, line 244 (of R14B02). This is code that is
responsible for calling a native code closure.

My application code does indeed send a few closures via messages, that
are later called by the receiver node. I do not use hot code upgrades
however, and the crashes are timing-related, as described before. I
therefore suspect the crashes are the result of a race condition
involving whatever code is responsible for making a received fun callable.

Philipp


On 03/29/2011 03:26 PM, pan@REDACTED wrote:
> Hi!
>
> This sounds really bad! A demo application that reproduces the bug
> would be really nice.
>
> Have you tried to enable core dumps to see if the erlang node crashes
> with a segfault? I suppose there are no erl_crash.dump files left
> after the crash that I can look at either?
>
> Any way to reproduce it would make it more easy to find!
>
> Cheers,
> /Patrik
>
> On Mon, 28 Mar 2011, Philipp Unterbrunner wrote:
>
>> The bug persists in r14b02.
>>
>> If I find time, I will make a small demo application so that others can
>> reproduce the bug.
>>
>> Philipp
>>
>> On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote:
>>> Hello,
>>>
>>> I have run into a serious and very annoying bug.
>>>
>>> Affects (at least); R13B04, R14A, R14B, R14B01
>>> Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP)
>>>
>>> When a newly started distributed node receives a high number of
>>> messages from another node, the newly started node crashes silently.
>>> Nothing is printed to the console. No crash dump or core dump is
>>> produced.
>>>
>>> In trying to find a work-around, I found the following curious
>>> behavior:
>>>
>>> * The bug *only* occurs for distributed nodes (but regardless of
>>> whether the nodes run on the same machine).
>>> * Waiting a few seconds (or even longer) before sending the first
>>> message to the newly started node does *not* make a difference. The
>>> node will still crash when confronted with a large number of
>>> incoming messages later.
>>> * Speed matters. When doing a debug build, the bug appears less
>>> often then when doing a release build, especially when HiPE is
>>> enabled. However, I managed to cause the bug even in debug mode, and
>>> when OTP was not compiled with native libs. The bug is simply much
>>> less likely to be observed.
>>> * The number of messages sent *initially* matters most. Slowly
>>> "ramping up" the load is a work-around. Once a node is working at
>>> high throughput, it is OK to stop sending messages for an arbitrary
>>> period and at a later point send a big chunk of messages that would
>>> have killed the node if sent initially.
>>> * Timing matters. Running the receiver node with +T 7 or higher
>>> makes the problem disappear.
>>> * Setting the sender node's distribution buffer size to the minimum
>>> (+zdbbl 1) makes the problem appear less often.
>>>
>>> I have reproduced the bug in various applications. The behavior
>>> described above also makes it fairly obvious that the application is
>>> not at fault.
>>>
>>> Rather, it appears that the receiver node is unable to buffer
>>> incoming messages and crashes. Of particular interest here is the
>>> fact that "ramping up" the load is a work-around. I suspect a
>>> low-level race condition where the receiver node does not allocate
>>> sufficient buffer space in time and crashes.
>>>
>>> Given that the existing work-arounds are not desirable ("ramp up"
>>> requires changes to the application code, +T 7 and +zdbbl 1 decrease
>>> performance), and given that the bug now persists over multiple
>>> releases, I hope someone can soon look into it.
>>>
>>> Thank you,
>>>
>>> Philipp
>>


From emile@REDACTED  Wed Mar 30 13:35:12 2011
From: emile@REDACTED (Emile Joubert)
Date: Wed, 30 Mar 2011 12:35:12 +0100
Subject: [erlang-bugs] Re: crypto from windows service?
In-Reply-To: <Pine.LNX.4.64.1103301149300.10609@arwen.otp.ericsson.se>
References: <4D91C68B.2070702@rabbitmq.com>
 <Pine.LNX.4.64.1103291604540.10609@arwen.otp.ericsson.se>
 <4D921171.3010707@rabbitmq.com>
 <Pine.LNX.4.64.1103301149300.10609@arwen.otp.ericsson.se>
Message-ID: <4D931570.2050203@rabbitmq.com>

On 30/03/11 11:00, Patrik Nyblom wrote:
> Hi!
>
> You're absolutely right in that you should not use debugtype console for
> production, I was just trying to narrow down the problem. The fact that
> console works suggests that something gets messed up for
> crypto/openssl when the stdout/stdin file descriptors get assigned to
> pipes/files...
>
> A few quest(ion)s:
>
> Do you get an erl_crash.dump somewhere when it crashes? Do you get nothing
> at all in the event viewer, or...?

There is no erl_crash.dump in sight. The event viewer contains nothing 
at the time of the crash. Subsequent attempts to stop the service lead 
to entries some time after the crash:

test: Using TerminateProcess to kill erlang.

> The actual debug log, does that contain anything?

The debuglog contains nothing beyond the opening banner:

Eshell V5.8.3  (abort with ^G)
(test@REDACTED)1>

> I've not used that particular OpenSSL version, could you try upgrading
> to the latest OpenSSL (and also install the redistributables needed)?
> Please also uninstall any other OpenSSL versions. WinXP and different
> DLL versions when running as a service is known to cause problems

I've been able to reproduce the bug using the "Win32 OpenSSL v1.0.0d 
Light" OpenSSL libraries instead of "Win32 OpenSSL v0.9.8r Light" on 2 
separate XP SP3 32bit machines. The problem manifests in an identical 
manner. One of the machines has Visual Studio installed, for which no 
redistributable is necessary and the other had the redistributable 
installed.

> unfortunately :( Also please install the OpenSSL DLLs in the Windows
> directory when asked by the installer.

This was done in all cases.

---

I'd be interested to know what OS, Erlang and OpenSSL version you are 
using so that I can try that. I've not been able to reproduce the 
problem under Windows7. So far it appears to be XP-specific.


Regards

Emile


From sverker@REDACTED  Wed Mar 30 14:59:14 2011
From: sverker@REDACTED (Sverker Eriksson)
Date: Wed, 30 Mar 2011 14:59:14 +0200
Subject: [erlang-bugs] Re: [erlang-bugs 10] Re: Distributed node crashes
 silently when initially receiving a big chunk of messages from another node
In-Reply-To: <4D93056E.2050608@inf.ethz.ch>
References: <4D652468.7000404@inf.ethz.ch> <4D906541.2060506@inf.ethz.ch>
 <Pine.LNX.4.64.1103291522070.10609@arwen.otp.ericsson.se>
 <4D93056E.2050608@inf.ethz.ch>
Message-ID: <4D932922.9000206@erix.ericsson.se>

We have a fix for one known HiPE bug. I haven't merged the fix to dev yet,
but you can get it from

https://github.com/sverker/otp/commit/b715c077a88d5ba68e4e79b32c1c0de087234bbf

It's a "minor" heap corruption related to binary matching. Could be 
worth trying even though we haven't confirmed it as the cause of any faults.

/Sverker, Erlang/OTP


Philipp Unterbrunner wrote:
> I do not have a reasonably small demo yet, but I managed to get some
> coredumps of beam.smp. The nodes crash with a segfault at
> hipe_mode_switch.c, line 244 (of R14B02). This is code that is
> responsible for calling a native code closure.
>
> My application code does indeed send a few closures via messages, that
> are later called by the receiver node. I do not use hot code upgrades
> however, and the crashes are timing-related, as described before. I
> therefore suspect the crashes are the result of a race condition
> involving whatever code is responsible for making a received fun callable.
>
> Philipp
>
>
> On 03/29/2011 03:26 PM, pan@REDACTED wrote:
>   
>> Hi!
>>
>> This sounds really bad! A demo application that reproduces the bug
>> would be really nice.
>>
>> Have you tried to enable core dumps to see if the erlang node crashes
>> with a segfault? I suppose there are no erl_crash.dump files left
>> after the crash that I can look at either?
>>
>> Any way to reproduce it would make it more easy to find!
>>
>> Cheers,
>> /Patrik
>>
>> On Mon, 28 Mar 2011, Philipp Unterbrunner wrote:
>>
>>     
>>> The bug persists in r14b02.
>>>
>>> If I find time, I will make a small demo application so that others can
>>> reproduce the bug.
>>>
>>> Philipp
>>>
>>> On 02/23/2011 04:14 PM, Philipp Unterbrunner wrote:
>>>       
>>>> Hello,
>>>>
>>>> I have run into a serious and very annoying bug.
>>>>
>>>> Affects (at least); R13B04, R14A, R14B, R14B01
>>>> Platform: Ubuntu Linux 10.10, kernel 2.6.35-25-server (SMP)
>>>>
>>>> When a newly started distributed node receives a high number of
>>>> messages from another node, the newly started node crashes silently.
>>>> Nothing is printed to the console. No crash dump or core dump is
>>>> produced.
>>>>
>>>> In trying to find a work-around, I found the following curious
>>>> behavior:
>>>>
>>>> * The bug *only* occurs for distributed nodes (but regardless of
>>>> whether the nodes run on the same machine).
>>>> * Waiting a few seconds (or even longer) before sending the first
>>>> message to the newly started node does *not* make a difference. The
>>>> node will still crash when confronted with a large number of
>>>> incoming messages later.
>>>> * Speed matters. When doing a debug build, the bug appears less
>>>> often then when doing a release build, especially when HiPE is
>>>> enabled. However, I managed to cause the bug even in debug mode, and
>>>> when OTP was not compiled with native libs. The bug is simply much
>>>> less likely to be observed.
>>>> * The number of messages sent *initially* matters most. Slowly
>>>> "ramping up" the load is a work-around. Once a node is working at
>>>> high throughput, it is OK to stop sending messages for an arbitrary
>>>> period and at a later point send a big chunk of messages that would
>>>> have killed the node if sent initially.
>>>> * Timing matters. Running the receiver node with +T 7 or higher
>>>> makes the problem disappear.
>>>> * Setting the sender node's distribution buffer size to the minimum
>>>> (+zdbbl 1) makes the problem appear less often.
>>>>
>>>> I have reproduced the bug in various applications. The behavior
>>>> described above also makes it fairly obvious that the application is
>>>> not at fault.
>>>>
>>>> Rather, it appears that the receiver node is unable to buffer
>>>> incoming messages and crashes. Of particular interest here is the
>>>> fact that "ramping up" the load is a work-around. I suspect a
>>>> low-level race condition where the receiver node does not allocate
>>>> sufficient buffer space in time and crashes.
>>>>
>>>> Given that the existing work-arounds are not desirable ("ramp up"
>>>> requires changes to the application code, +T 7 and +zdbbl 1 decrease
>>>> performance), and given that the bug now persists over multiple
>>>> releases, I hope someone can soon look into it.
>>>>
>>>> Thank you,
>>>>
>>>> Philipp
>>>>         


From emile@REDACTED  Wed Mar 30 16:44:35 2011
From: emile@REDACTED (Emile Joubert)
Date: Wed, 30 Mar 2011 15:44:35 +0100
Subject: [erlang-bugs] Re: crypto from windows service?
In-Reply-To: <Pine.LNX.4.64.1103301511480.10609@arwen.otp.ericsson.se>
References: <4D91C68B.2070702@rabbitmq.com>
 <Pine.LNX.4.64.1103291604540.10609@arwen.otp.ericsson.se>
 <4D921171.3010707@rabbitmq.com>
 <Pine.LNX.4.64.1103301149300.10609@arwen.otp.ericsson.se>
 <4D931570.2050203@rabbitmq.com>
 <Pine.LNX.4.64.1103301511480.10609@arwen.otp.ericsson.se>
Message-ID: <4D9341D3.4000503@rabbitmq.com>

Hi Patrik,

On 30/03/11 15:04, Patrik Nyblom wrote:
> Hi!
>
> I'm using Windows XP SP3 as well, but using OpenSSL 0.9.7e. If I upgrade
> to the latest on WinXP, I get the same symptom as you (Yes!).
>
> Could you verify that your problems disappear if you use 0.9.7e? I've
> attached the installer to this mail.

Yes, I can confirm the problems disappear with 0.9.7e. Using the older 
version of OpenSSL is a reasonable workaround until a better solution is 
found - thanks for that.


Regards

Emile


From eric.pailleau@REDACTED  Thu Mar 31 21:53:59 2011
From: eric.pailleau@REDACTED (PAILLEAU Eric)
Date: Thu, 31 Mar 2011 21:53:59 +0200
Subject: [erlang-bugs] Missing init:get_args() function ?
Message-ID: <4D94DBD7.2000306@wanadoo.fr>

$> erl -sname titi -config toto

Erlang R14B02 (erts-5.8.3) [source] [smp:2:2] [rq:2] [async-threads:0]
[hipe] [kernel-poll:false]

Eshell V5.8.3  (abort with ^G)
1> init:get_args().
** exception error: undefined function init:get_args/0

while

2> init:get_plain_arguments().
[]

It looks like init:get_args() does not exist, although it is still in the
latest documentation.

Did I miss something in the docs, or is it a bug?
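
For reference, a minimal sketch of the init functions that do exist for
reading command-line arguments. The module name args_demo is hypothetical;
init:get_plain_arguments/0, init:get_arguments/0 and init:get_argument/1
are the real calls, and the shape of the results depends on how the node
was started.

```erlang
-module(args_demo).
-export([show/0]).

%% Collect the command-line information available from init.
show() ->
    [{plain, init:get_plain_arguments()},    % plain args, e.g. after -extra
     {flags, init:get_arguments()},          % all user flags with their values
     {config, init:get_argument(config)}].   % {ok, Values} or error if absent
```

With the command line above, the config entry would be expected to carry
"toto", and the plain entry stays [] as in the shell session.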

Regards.