erlang crashes on slave:start in gentoo
"Gösta Ask (Mobile Arts)"
gosta.ask@REDACTED
Thu Dec 30 09:32:02 CET 2004
[This is a resend, using my .se address. Maybe the previous post, sent
yesterday using the .com address, was taken out by some filter...]
Hi,
I reported this behavior before, see
http://www.erlang.org/ml-archive/erlang-questions/200404/msg00184.html
but I got no answers at that time. I realize you may see this as a
question about a specific OS/machine configuration, and thus of no
general interest to the Erlang community. There was one thing I did
not notice at first, though, when I asked erlang-questions: a crash
dump is generated. So even if the problem is OS-specific, I think it
is odd that Erlang crashes.
I get the same behavior on my gentoo machine at home with a 2.6 kernel
(standard out-of-the-box configuration) and a freshly emerged R10B-0
Erlang. Maybe it is something to take a look at?
I would be very grateful if you had the time to do that, or could at
least give me some hints about what to look for in the setup of my machine.
I need to use the slave module to build our application locally.
Everything else works just fine.
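A minimal sketch of the setup checks that seem relevant here, assuming slave:start launches the new node through rsh (the default described in the slave module documentation) and that the host name must resolve on the machine. The helper names have_cmd and report_cmd and the HOST_TO_CHECK variable are made up for illustration; falcon is the host from the logs, and getent is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helpers for checking the prerequisites of slave:start.

# have_cmd NAME: succeeds if NAME is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# report_cmd NAME: prints whether NAME is available.
report_cmd() {
    if have_cmd "$1"; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# slave:start uses rsh unless erl was started with "-rsh Program".
report_cmd rsh
report_cmd ssh
report_cmd erl

# Check that the target host name resolves (assumes getent exists).
HOST_TO_CHECK=${HOST_TO_CHECK:-falcon}
if have_cmd getent; then
    getent hosts "$HOST_TO_CHECK" || echo "$HOST_TO_CHECK: does not resolve"
fi
```

If rsh turns out to be missing, starting the master node with "erl -rsh ssh -sname foo" is the documented way to select another program. Whether that is the cause of the timeout here is only a guess.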
rgds,
/Gösta Ask (at Mobile Arts dot com)
==============================================================
The logs
==============================================================
On a Solaris machine, postiljon, the slave node starts fine:
bash-2.05$ which erl
/opt/MA/otp/R9C-0/bin/erl
bash-2.05$ erl -sname foo
Erlang (BEAM) emulator version 5.3 [source] [hipe]
Eshell V5.3 (abort with ^G)
(foo@REDACTED)1> net_adm:ping(bar@REDACTED).
pong
(foo@REDACTED)2> nodes().
[bar@REDACTED]
(foo@REDACTED)3> slave:start(postiljon, foobar).
{ok,foobar@REDACTED}
(foo@REDACTED)4> nodes().
[bar@REDACTED,foobar@REDACTED]
(foo@REDACTED)7> slave:stop(foobar@REDACTED).
ok
(foo@REDACTED)8> nodes().
[bar@REDACTED]
(foo@REDACTED)9>
(foo@REDACTED)10> inet_db:get_rc().
[{domain,"mobilearts.local"},{nameserver,{192,168,211,3}}]
(foo@REDACTED)11> inet_db:res_option(lookup).
[file,dns]
User switch command
--> q
bash-2.05$ uname -a
SunOS postiljon 5.9 Generic_112233-10 sun4u sparc SUNW,UltraAX-i2
bash-2.05$
==============================================================
But on my gentoo machine (falcon) it fails:
askg@REDACTED askg $ which erl
/usr/local/bin/erl
askg@REDACTED askg $ erl -sname foo
Erlang (BEAM) emulator version 5.3.6.2 [source] [hipe]
Eshell V5.3.6.2 (abort with ^G)
(foo@REDACTED)1>
(foo@REDACTED)2> net_adm:ping(bar@REDACTED).
pong
(foo@REDACTED)3> nodes().
[bar@REDACTED]
[start the slave node in debug mode]
(foo@REDACTED)10> dbg:c(slave,start,[falcon, foobar]).
(<0.58.0>) init ! {<0.58.0>,{get_argument,progname}}
(<0.58.0>) out {init,request,1}
(<0.58.0>) << {init,{ok,[["erl"]]}}
(<0.58.0>) in {init,request,1}
(<0.58.0>) << timeout
(<0.58.0>) <0.18.0> ! {'$gen_call',{<0.58.0>,#Ref<0.0.0.308>},longnames}
(<0.58.0>) out {gen,wait_resp_mon,3}
(<0.58.0>) << {#Ref<0.0.0.308>,false}
(<0.58.0>) in {gen,wait_resp_mon,3}
(<0.58.0>) << timeout
(<0.58.0>) << timeout
(<0.58.0>) <0.18.0> ! {'$gen_call',{<0.58.0>,#Ref<0.0.0.310>},
{connect,normal,foobar@REDACTED}}
(<0.58.0>) out {gen,wait_resp_mon,3}
(<0.58.0>) << {#Ref<0.0.0.310>,false}
(<0.58.0>) in {gen,wait_resp_mon,3}
(<0.58.0>) << timeout
(<0.58.0>) <0.58.0> ! {'DOWN',#Ref<0.0.0.322>,
process,
{net_kernel,foobar@REDACTED},
noconnection}
(<0.58.0>) << {'DOWN',#Ref<0.0.0.322>,
process,
{net_kernel,foobar@REDACTED},
noconnection}
[garbage coll.]
(<0.58.0>) << timeout
(<0.58.0>) <0.18.0> ! {'$gen_call',{<0.58.0>,#Ref<0.0.0.323>},
{disconnect,foobar@REDACTED}}
(<0.58.0>) out {gen,wait_resp_mon,3}
(<0.58.0>) << {#Ref<0.0.0.323>,false}
(<0.58.0>) in {gen,wait_resp_mon,3}
(<0.58.0>) << timeout
[here is the call to spawn which tries to start the slave node]
(<0.58.0>) spawn <0.61.0> as slave:wait_for_slave(<0.58.0>,"falcon",foobar,foobar@REDACTED,[],no_link,erl)
(<0.58.0>) out {slave,start_it,6}
(<0.61.0>) in {slave,wait_for_slave,7}
(<0.61.0>) register slave_waiter_0
(<0.61.0>) << timeout
(<0.61.0>) <0.18.0> ! {'$gen_call',{<0.61.0>,#Ref<0.0.0.326>},longnames}
(<0.61.0>) out {gen,wait_resp_mon,3}
(<0.61.0>) << {#Ref<0.0.0.326>,false}
(<0.61.0>) in {gen,wait_resp_mon,3}
(<0.61.0>) << timeout
[garbage coll.]
(<0.61.0>) out {slave,wait_for_slave,7}
(<0.61.0>) in {slave,wait_for_slave,7}
(<0.61.0>) << timeout
[garbage coll.]
(<0.61.0>) << timeout
(<0.61.0>) <0.18.0> ! {'$gen_call',{<0.61.0>,#Ref<0.0.0.352>},
{connect,normal,foobar@REDACTED}}
(<0.61.0>) out {gen,wait_resp_mon,3}
(<0.61.0>) << {#Ref<0.0.0.352>,false}
(<0.61.0>) in {gen,wait_resp_mon,3}
(<0.61.0>) << timeout
(<0.61.0>) <0.61.0> ! {'DOWN',#Ref<0.0.0.361>,
process,
{net_kernel,foobar@REDACTED},
noconnection}
(<0.61.0>) << {'DOWN',#Ref<0.0.0.361>,
process,
{net_kernel,foobar@REDACTED},
noconnection}
(<0.61.0>) << timeout
(<0.61.0>) <0.18.0> ! {'$gen_call',{<0.61.0>,#Ref<0.0.0.362>},
{disconnect,foobar@REDACTED}}
(<0.61.0>) out {gen,wait_resp_mon,3}
(<0.61.0>) << {#Ref<0.0.0.362>,false}
(<0.61.0>) in {gen,wait_resp_mon,3}
[here is the timeout]
(<0.61.0>) << timeout
(<0.61.0>) <0.58.0> ! {result,{error,timeout}}
(<0.58.0>) << {result,{error,timeout}}
(<0.61.0>) exit normal
(<0.61.0>) unregister slave_waiter_0
(<0.58.0>) in {slave,start_it,6}
{error,timeout}
(foo@REDACTED)13> inet_db:get_rc().
[{domain,"mobilearts.local"},
{nameserver,{192,168,211,3}},
{nameserver,{80,252,160,162}},
{nameserver,{80,252,160,164}}]
(foo@REDACTED)14> inet_db:res_option(lookup).
[file,dns]
(foo@REDACTED)15>
askg@REDACTED askg $ uname -a
Linux falcon 2.4.20-gentoo-r6 #1 Fri Feb 27 10:59:40 CET 2004 i686 Pentium II (Deschutes) GenuineIntel GNU/Linux
An erl_crash.dump file is created:
askg@REDACTED askg $ ls -l erl*
-rw-r----- 1 askg users 155290 Dec 29 09:30 erl_crash.dump
(foo@REDACTED)16> crashdump_viewer:start().
WebTool is available at http://localhost:8888/
Or http://127.0.0.1:8888/
ok
showing, for example, under "General"
Slogan Kernel pid terminated (application_controller) (shutdown)
Node name 'nonode@REDACTED'
Crashdump created on Wed Dec 29 09:30:15 2004
and as "Processes":
Pid Name Spawned as State Reductions Stack+heap MsgQ Length
<0.0.0> init otp_ring0:start/2 Running 4621 6765 1
<0.2.0> erl_prim_loader erlang:apply/2 Waiting 9119 233 0
<0.4.0> error_logger proc_lib:init_p/5 Waiting 994 233 0
<0.12.0> global:init_the_locker/1 Waiting 4 233 0
and "Expand MsgQ" for <0.0.0> reports:
{'EXIT',<0.1.0>,
{noproc,{gen_server,call,
[application_controller,
{load_application,stdlib},
infinity]}}}
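The slogan shown by the viewer can probably also be read directly from the dump file. The following is a sketch only, assuming the dump is plain text and contains a line starting with "Slogan:"; the function name dump_slogan is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first slogan line of a crash dump.
# Assumes a plain-text dump with a line that starts with "Slogan:".
dump_slogan() {
    grep '^Slogan:' "${1:-erl_crash.dump}" | head -n 1
}

# Only look at the dump if one exists in the current directory.
if [ -f erl_crash.dump ]; then
    dump_slogan erl_crash.dump
fi
```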
==============================================================