[erlang-questions] crash dump at ejabberd startup

Michael Santos michael.santos@REDACTED
Wed Nov 17 00:22:02 CET 2010


On Tue, Nov 16, 2010 at 09:36:49PM +0100, tom@REDACTED wrote:

> I have a process running by unix user "ejabberd", yes:
> # ps axu
> ejabberd  2355  0.0  0.1  3448  1284  ??  SJ    5:54AM   0:00.14 
> /usr/local/lib/erlang/erts-5.8.1/bin/epmd -daemon
> 
> I guess this process was started by the ejabberdctl script when it tried to 
> launch ejabberd.
> 
> EPMD tries to connect to "localhost" ?

The Erlang node connects to epmd (on 127.0.0.1:4369). The TCP connection
is working, but the socket is closed immediately.
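
If you want to check this by hand from inside the jail, something like
the sketch below may help. It connects to epmd and sends a NAMES_REQ
(request byte 110). The module name epmd_probe and the 5 second timeouts
are just my choices for illustration. A reply of {error, closed} would
mean epmd accepted the connection and then dropped it.

```erlang
-module(epmd_probe).
-export([probe/0, probe/2]).

%% Probe the default epmd address used by a starting node.
probe() ->
    probe({127,0,0,1}, 4369).

%% Connect, send a NAMES_REQ and return whatever comes back.
probe(Addr, Port) ->
    Opts = [binary, {packet, raw}, {active, false}],
    case gen_tcp:connect(Addr, Port, Opts, 5000) of
        {ok, S} ->
            %% epmd requests carry a 2-byte length prefix; 110 is NAMES_REQ
            _ = gen_tcp:send(S, <<1:16, 110>>),
            Reply = gen_tcp:recv(S, 0, 5000),
            gen_tcp:close(S),
            {connected, Reply};
        {error, Reason} ->
            {connect_failed, Reason}
    end.
```

On a healthy setup epmd_probe:probe() should give {connected, {ok, Bin}}
where Bin starts with the epmd port number as a 32-bit integer.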

> Then we are on the right track now. I guess FreeBSD jails have only sort
> of a "half own localhost" and thus one has to configure the jail's IP
> address instead of localhost / 127.0.0.1 (which usually works for all
> other daemons).

Which version of Erlang are you using? R14B?

epmd in R14B was changed to accept some messages (such as name registrations)
only from loopback addresses (127.0.0.0/8).
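
One way to see which source address a "loopback" connection really has
inside the jail is to connect to a throwaway listener and ask for the
peer name. This is only a sketch: the module name loopback_peer is made
up, and I am assuming (not asserting) that the jail may rewrite 127.0.0.1
to the jail's own address. If the reported address is outside 127/8, that
would explain why epmd refuses the registration.

```erlang
-module(loopback_peer).
-export([peer_of_loopback_connect/0, in_127_8/1]).

%% Open a temporary listener, connect to it via 127.0.0.1 and report the
%% source address the listener sees for that connection.
peer_of_loopback_connect() ->
    {ok, L} = gen_tcp:listen(0, [binary, {active, false}]),
    {ok, Port} = inet:port(L),
    {ok, C} = gen_tcp:connect({127,0,0,1}, Port,
                              [binary, {active, false}], 5000),
    {ok, S} = gen_tcp:accept(L, 5000),
    {ok, {PeerAddr, _PeerPort}} = inet:peername(S),
    lists:foreach(fun gen_tcp:close/1, [C, S, L]),
    {PeerAddr, in_127_8(PeerAddr)}.

%% True when the IPv4 address falls inside 127.0.0.0/8.
in_127_8({127, _, _, _}) -> true;
in_127_8(_) -> false.
```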

> Where can I configure Erlang to use the Jails IP instead localhost/127.0.0.1?
> In that mysterious "inetrc" config file ?

inetrc is used for hostname resolution. See:

http://www.erlang.org/doc/apps/erts/inet_cfg.html
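
To give an idea of what goes in there, a minimal inetrc might look like
the fragment below. The file name, host name and address are
placeholders I made up. Note that it only affects how names are looked
up.

```erlang
%% erl_inetrc (hypothetical file name)
%% Map a host name to an address; both values are placeholders.
{host, {192,0,2,10}, ["jailhost"]}.
%% Consult the entries above first, then the OS resolver.
{lookup, [file, native]}.
```

It would be loaded with: erl -kernel inetrc '"./erl_inetrc"'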

> However, there must be a way as I earlier had it running already in another 
> test host (half a year ago). Just cannot remember the details on how I did 
> it.
>  
> > Check epmd is running and if the node is allowed to contact it from the
> > jail. You can get debugging info by running epmd manually: epmd -d -d -d
> 
> Ok, did that as
> # kill -9 2355
> # /usr/local/lib/erlang/erts-5.8.1/bin/epmd -d -d -d
> epmd: Tue Nov 16 20:29:59 2010: epmd running - daemon = 0
> epmd: Tue Nov 16 20:29:59 2010: try to initiate listening port 4369
> epmd: Tue Nov 16 20:29:59 2010: starting
> epmd: Tue Nov 16 20:29:59 2010: entering the main select() loop
> epmd: Tue Nov 16 20:30:04 2010: time in seconds: 1289939404
> epmd: Tue Nov 16 20:30:09 2010: time in seconds: 1289939409
> epmd: Tue Nov 16 20:30:14 2010: time in seconds: 1289939414
> epmd: Tue Nov 16 20:30:20 2010: time in seconds: 1289939420
> epmd: Tue Nov 16 20:30:25 2010: time in seconds: 1289939425
> epmd: Tue Nov 16 20:30:30 2010: time in seconds: 1289939430
> epmd: Tue Nov 16 20:30:35 2010: time in seconds: 1289939435
> epmd: Tue Nov 16 20:30:40 2010: time in seconds: 1289939440
> ...
> ...
> ^C

Was epmd started up inside the jail? Did you bring up ejabberd as well?
I don't see any registration attempts.

> ["inet_tcp",{{badmatch,{error,epmd_close}},[{inet_tcp_dist,listen,1},
> {'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},

> is it sure the error message above means the connection between erlang
> and localhost:4369 ?

Just a guess :)

Here is the error message:

{badmatch,{error,epmd_close}}

When the ejabberd node starts up, it connects to 127.0.0.1:4369 and sends
an EPMD_ALIVE2_REQ message to register its name and distribution port (an
ephemeral port).

The code that does this is in inet_tcp_dist:listen/1 which calls
erl_epmd:register_node/2.

{ok, Creation} = erl_epmd:register_node(Name, Port)

epmd closes the connection immediately (possibly because the
EPMD_ALIVE2_REQ message is not allowed) and register_node/2 returns
{error, epmd_close}, causing the badmatch.
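
For reference, here is a rough sketch of that exchange done by hand. It
is not the code in erl_epmd. The module name alive2_sketch and the
function names are mine, and the field values (node type 77, protocol 0,
distribution version 5) are what I would expect for a normal node of this
era, so treat them as assumptions. The registration only lasts as long as
the socket stays open, which is why the socket is returned.

```erlang
-module(alive2_sketch).
-export([encode_alive2_req/2, decode_alive2_resp/1, try_register/2]).

%% Build an ALIVE2_REQ (tag 120) including the 2-byte length prefix.
encode_alive2_req(Name, DistPort) when is_list(Name) ->
    NameBin = list_to_binary(Name),
    Body = <<120,                      % ALIVE2_REQ
             DistPort:16,              % distribution listen port
             77,                       % node type: normal node (assumed)
             0,                        % protocol: TCP/IPv4
             5:16, 5:16,               % highest/lowest version (assumed)
             (byte_size(NameBin)):16, NameBin/binary,
             0:16>>,                   % no extra data
    <<(byte_size(Body)):16, Body/binary>>.

%% ALIVE2_RESP: tag 121, result byte (0 = ok), 16-bit creation.
decode_alive2_resp(<<121, 0, Creation:16>>) -> {ok, Creation};
decode_alive2_resp(<<121, Result, _:16>>) -> {error, {epmd_result, Result}};
decode_alive2_resp(Other) -> {error, {unexpected, Other}}.

%% Try to register Name with the local epmd.
try_register(Name, DistPort) ->
    Opts = [binary, {packet, raw}, {active, false}],
    case gen_tcp:connect({127,0,0,1}, 4369, Opts, 5000) of
        {ok, S} ->
            _ = gen_tcp:send(S, encode_alive2_req(Name, DistPort)),
            case gen_tcp:recv(S, 4, 5000) of
                {ok, Resp} ->
                    {decode_alive2_resp(Resp), S};
                {error, closed} ->
                    %% epmd dropped the connection without answering
                    {error, epmd_close};
                {error, Reason} ->
                    {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

If alive2_sketch:try_register("test", 4000) gives {error, epmd_close}
inside the jail but succeeds outside, epmd is rejecting the request
rather than the node misbehaving.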

You can get the same error by starting up two Erlang nodes, using the
first as a fake epmd that accepts the connection and closes it right away
(kill epmd first if it is running):

$ erl
Erlang R14B01 (erts-5.8.2) [source] [smp:2:2] [rq:2] [async-threads:0]
[hipe] [kernel-poll:false]

Eshell V5.8.2  (abort with ^G)
1> {ok, L} = gen_tcp:listen(4369, [{active, false}]).
2> {ok, S} = gen_tcp:accept(L), ok = gen_tcp:close(S).

And in another shell:

$ erl -name test
{error_logger,{{2010,11,16},{18,12,6}},"Protocol: ~p: register error: ~p~n",["inet_tcp",{{badmatch,{error,epmd_close}},[{inet_tcp_dist,listen,1},{net_kernel,start_protos,4},{net_kernel,start_protos,3},{net_kernel,init_node,2},{net_kernel,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}]}



