[erlang-questions] crash dump at ejabberd startup

Michael Santos <>
Fri Nov 19 00:03:55 CET 2010


On Thu, Nov 18, 2010 at 08:55:40PM +0100,  wrote:

> OK, to be precise, I repeated the test:
> 
> I first started tcpdump and let it run.
> I then started epmd -d -d -d in a second shell.
> I finally started erl -name foo in a third shell.

Awesome, this is exactly what I needed to see! Thanks for being so
patient with this.

> epmd -d -d -d now spat out something more informative:

> ***** 00000000  00 10 78 7c e5 4d 00 00  05 00 05 00 03 66 6f 6f  |..x|.M.......foo|
> ***** 00000010  00 00                                             |..|
> epmd: Thu Nov 18 19:26:44 2010: time in seconds: 1290108404
> epmd: Thu Nov 18 19:26:44 2010: ** got ALIVE2_REQ
> epmd: Thu Nov 18 19:26:44 2010: ALIVE2_REQ from non local address
> epmd: Thu Nov 18 19:26:44 2010: closing connection on file descriptor 4

So it's confirmed: epmd dropped the ALIVE2_REQ because the source
address was not the local address it expects.
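
For reference, here is a rough sketch of how the 18 bytes in the dump
quoted above decode, assuming the ALIVE2_REQ layout given in the EPMD
protocol documentation (2-byte length, tag 120, port, node type,
protocol, highest/lowest version, name, extra). The struct and function
names are made up for illustration; this is not code from epmd itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: names are hypothetical, not taken from epmd_srv.c. */
struct alive2_req {
    uint16_t port;             /* distribution port of the node */
    uint8_t  node_type;        /* 77 = normal node, 72 = hidden node */
    uint8_t  protocol;         /* 0 = TCP/IPv4 */
    uint16_t highest_version;
    uint16_t lowest_version;
    char     name[256];        /* node name, NUL-terminated here */
    uint16_t extra_len;
};

/* Read a big-endian 16-bit value. */
static uint16_t get_u16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if buf is not a well-formed ALIVE2_REQ. */
int parse_alive2_req(const unsigned char *buf, size_t len,
                     struct alive2_req *out)
{
    size_t body, nlen;

    if (len < 2)
        return -1;
    body = get_u16(buf);               /* length of everything after it */
    if (body < 13 || body + 2 > len)   /* 13 = fixed fields, empty name */
        return -1;
    buf += 2;
    if (buf[0] != 120)                 /* 120 ('x') is the ALIVE2_REQ tag */
        return -1;

    out->port            = get_u16(buf + 1);
    out->node_type       = buf[3];
    out->protocol        = buf[4];
    out->highest_version = get_u16(buf + 5);
    out->lowest_version  = get_u16(buf + 7);

    nlen = get_u16(buf + 9);
    if (nlen >= sizeof(out->name) || 11 + nlen + 2 > body)
        return -1;
    memcpy(out->name, buf + 11, nlen);
    out->name[nlen] = '\0';

    out->extra_len = get_u16(buf + 11 + nlen);
    if (13 + nlen + out->extra_len > body)
        return -1;
    return 0;
}
```

Fed the bytes from the dump, this gives port 31973 (0x7ce5), node type
77, protocol 0, versions 5/5 and name "foo". So the request itself is
well formed; it was refused only because of where it came from.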

> The inetrc file I was using was:
> 
> {lookup,[file, dns]}.
> {host,{64,120,5,168}, ["mail.kepos.org"]}.
> {file, resolv, "/etc/resolv.conf"}.

inetrc is used for DNS resolution. It won't affect this particular
issue.

> tcpdump: WARNING: lo0: no IPv4 address assigned
> tcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 96 bytes
> 19:34:01.445154 IP (tos 0x0, ttl 64, id 42836, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->728)!)
>     mail.58928 > mail.4369: Flags [S], cksum 0x1c18 (correct), seq 1795656002, win 65535, options [mss 16344,nop,wscale 3,sackOK,TS val 221277388 ecr 0], length 0

> As you asked for mail.kepos.org and 64.120.5.168:

> So I still suspect the problem is that Erlang insists on using only
> localhost, which isn't a good thing for FreeBSD jails: jails have no
> fully functional localhost (127.0.0.1 and localhost exist and answer
> pings, yes, but there are limitations nonetheless).

Thanks! I've never used FreeBSD jails, so I was confused about how they
work. Erlang nodes are hard-coded to connect to the epmd port on 127.0.0.1.
The jail apparently rewrites connections to localhost to use the IP address
of the interface the jail is bound to, so epmd sees a source address that
isn't 127.0.0.1. This behaviour is really sort of nasty ;)

> If there were a way to make Erlang use a configurable IP instead of 
> localhost, the issue would most probably be resolved. 

I've attached a patch that disables the check.

I'll put together a better patch later. I can see a few ways of doing
this:

1. make the source IP address configurable, as you suggested

2. check that the source IP is the same as the destination IP

3. check that the connection came over the loopback interface (probably
no portable way to do this)

4. add an option to disable the check (the old behaviour)

Aside from jails, I'm not sure anyone else would be affected by this. So
maybe option 4 is the way to go.
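
To make option 2 a bit more concrete, here is an untested sketch of what
such a check might look like. It is IPv4 only, and the function names
are hypothetical; none of this is in epmd today. The idea is to accept a
peer if its address is in 127.0.0.0/8, or if getpeername() and
getsockname() report the same address, which is what the tcpdump output
above shows inside the jail (mail.58928 > mail.4369).

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: a peer counts as local if it is on 127.0.0.0/8
 * or if it has the same address the connection was accepted on. */
int addrs_look_local(const struct sockaddr_in *peer,
                     const struct sockaddr_in *self)
{
    uint32_t p = ntohl(peer->sin_addr.s_addr);

    if ((p & 0xff000000u) == 0x7f000000u)   /* 127.0.0.0/8 */
        return 1;
    return peer->sin_addr.s_addr == self->sin_addr.s_addr;
}

/* Hypothetical wrapper for a connected socket; returns 0 on any error
 * or for non-IPv4 sockets, so failures are treated as "not local". */
int fd_is_local_peer(int fd)
{
    struct sockaddr_in peer, self;
    socklen_t plen = sizeof peer, slen = sizeof self;

    if (getpeername(fd, (struct sockaddr *)&peer, &plen) < 0)
        return 0;
    if (getsockname(fd, (struct sockaddr *)&self, &slen) < 0)
        return 0;
    if (peer.sin_family != AF_INET || self.sin_family != AF_INET)
        return 0;
    return addrs_look_local(&peer, &self);
}
```

The catch is that this also accepts any process that can connect from
the host's own address, so it is somewhat looser than a strict loopback
check.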


diff --git a/erts/epmd/src/epmd_srv.c b/erts/epmd/src/epmd_srv.c
index ef471a4..e2cc2dc 100644
--- a/erts/epmd/src/epmd_srv.c
+++ b/erts/epmd/src/epmd_srv.c
@@ -766,6 +766,9 @@ static int conn_open(EpmdVars *g,int fd)
       dbg_tty_printf(g,2,(s->local_peer) ? "Local peer connected" :
 		     "Non-local peer connected");
 
+      /* XXX allow local messages from all clients */
+      s->local_peer = EPMD_TRUE;
+
       s->want = 0;		/* Currently unknown */
       s->got  = 0;
       s->mod_time = current_time(g); /* Note activity */

