[erlang-questions] rabbit, epmd and bonded interfaces woes
Leonard Boyce
leonard.boyce@REDACTED
Fri Jan 13 20:52:06 CET 2012
Hopefully someone has hit this issue and can shed some light.
Network interfaces are configured bonded in pairs.
Rabbit crashes and dumps on startup with the error:
"Protocol: ~p: register error: ~p~n",["inet_tcp",{{badmatch,
{error,epmd_close}}....<snip>
In my research to date the only similar issue I've been able to find is a
reference to running ejabberd in FreeBSD jails and the solution was to patch
epmd to allow all callers (not limit to 127.x.x.x), which is not really safe.
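To make the restriction concrete, here is a minimal Python sketch of a "callers must come from 127.x.x.x" test. It only illustrates the idea; it is not epmd's actual C code, and the helper name is_loopback_peer is made up for this example.

```python
import ipaddress


def is_loopback_peer(peer_ip: str) -> bool:
    """Return True if peer_ip lies in 127.0.0.0/8 (hypothetical helper)."""
    return ipaddress.ip_address(peer_ip) in ipaddress.ip_network("127.0.0.0/8")


# A caller arriving from the loopback range would pass the test ...
print(is_loopback_peer("127.0.0.1"))
# ... while one arriving from a LAN address such as bond1's would not.
print(is_loopback_peer("192.168.100.1"))
```

Under a test like this, a local process whose connection carries a non-loopback source address is treated the same as a remote one.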
We've tried running epmd -d -d -d, connecting from another terminal using
"erl -sname somestrangename", and inspecting the connection with tcpdump.
tcpdump shows that "erl -sname somestrangename" seems to be calling from the
public interface.
I have a sneaking suspicion that this has something to do with incorrect
handling of bonded interfaces as another server with exactly the same
OS/hardware and software versions (minus bonded interfaces) works perfectly.
We've tried R15B and results are exactly the same.
Any advice/help would be appreciated.
Thanks,
Leonard
---
Environment;
#############################################
Ubuntu 10.04 LTS
Linux web1 2.6.32-37-server #81-Ubuntu SMP Fri Dec 2 20:49:12 UTC 2011 x86_64 GNU/Linux
Erlang R14B03 (erts-5.8.4) [source] [64-bit] [smp:16:16] [rq:16] [async-threads:0] [kernel-poll:false]
File: /etc/hostname
#############################################
web1
File: /etc/hosts
#############################################
127.0.0.1 web1 localhost
192.168.100.1 web1 web1.XXXXXX.XXX
192.168.100.83 web2 web2.XXXXXX.XXX
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
File: /etc/network/interfaces
#############################################
# The loopback network interface
auto lo
iface lo inet loopback
auto bond0
iface bond0 inet static
address XX.XX.XX.XX
netmask 255.255.255.240
gateway xx.xx.xx.xx
bond-slaves eth0 eth1
bond_mode 802.3ad
bond_miimon 100
bond_lacp_rate 1
auto bond1
iface bond1 inet static
address 192.168.100.1
netmask 255.255.255.0
bond-slaves eth2 eth3
bond_mode 802.3ad
bond_miimon 100
bond_lacp_rate 1
auto bond1:0
iface bond1:0 inet static
address 192.168.100.2
netmask 255.255.255.0
auto bond1:1
iface bond1:1 inet static
address 192.168.100.3
netmask 255.255.255.0
TCP Dump (sanitized XX.XX.XX.XX for public IP);
#############################################
leonard@REDACTED:~$ sudo tcpdump -i lo -vv port 4369
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
14:41:27.873868 IP (tos 0x0, ttl 64, id 61955, offset 0, flags [DF], proto TCP
(6), length 60)
XX.XX.XX.XX.42982 > web1.4369: Flags [S], cksum 0xba6b (correct), seq
2754024620, win 32792, options [mss 16396,sackOK,TS val 24962735 ecr
0,nop,wscale 7], length 0
14:41:27.873884 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6),
length 60)
web1.4369 > web1.42982: Flags [S.], cksum 0xeb9d (correct), seq
4276970213, ack 2754024621, win 32768, options [mss 16396,sackOK,TS val
24962735 ecr 24962735,nop,wscale 7], length 0
14:41:27.873895 IP (tos 0x0, ttl 64, id 61956, offset 0, flags [DF], proto TCP
(6), length 52)
XX.XX.XX.XX.42982 > web1.4369: Flags [.], cksum 0x5897 (correct), seq
2754024621, ack 4276970214, win 257, options [nop,nop,TS val 24962735 ecr
24962735], length 0
14:41:27.873938 IP (tos 0x0, ttl 64, id 61957, offset 0, flags [DF], proto TCP
(6), length 82)
XX.XX.XX.XX.42982 > web1.4369: Flags [P.], seq 0:30, ack 1, win 257,
options [nop,nop,TS val 24962735 ecr 24962735], length 30
14:41:27.873945 IP (tos 0x0, ttl 64, id 33865, offset 0, flags [DF], proto TCP
(6), length 52)
web1.4369 > web1.42982: Flags [.], cksum 0xd3a4 (correct), seq 1, ack 31,
win 256, options [nop,nop,TS val 24962735 ecr 24962735], length 0
14:41:27.874143 IP (tos 0x0, ttl 64, id 33866, offset 0, flags [DF], proto TCP
(6), length 52)
web1.4369 > web1.42982: Flags [F.], cksum 0xd3a3 (correct), seq 1, ack 31,
win 256, options [nop,nop,TS val 24962735 ecr 24962735], length 0
14:41:27.874188 IP (tos 0x0, ttl 64, id 61958, offset 0, flags [DF], proto TCP
(6), length 52)
XX.XX.XX.XX.42982 > web1.4369: Flags [F.], cksum 0x5877 (correct), seq 30,
ack 2, win 257, options [nop,nop,TS val 24962735 ecr 24962735], length 0
14:41:27.874202 IP (tos 0x0, ttl 64, id 33867, offset 0, flags [DF], proto TCP
(6), length 52)
web1.4369 > web1.42982: Flags [.], cksum 0xd3a2 (correct), seq 2, ack 32,
win 256, options [nop,nop,TS val 24962735 ecr 24962735], length 0
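One way to cross-check the source address seen in the capture, without involving Erlang at all, is to ask the kernel which local address it would pick for a given destination. The Python sketch below is a diagnostic idea only. It assumes that connecting a UDP socket triggers the same route and source-address selection that a TCP connect would use, and the function name source_address_for is made up for this example.

```python
import socket


def source_address_for(dest_ip: str, dest_port: int = 4369) -> str:
    """Ask the kernel which local address it would use to reach dest_ip.

    connect() on a UDP socket sends no packets; it only makes the kernel
    choose a route and a source address, which getsockname() then reports.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((dest_ip, dest_port))
        return sock.getsockname()[0]


# Which source address is chosen for traffic to the loopback address?
print(source_address_for("127.0.0.1"))
```

A 127.x.x.x result would be the usual expectation here; a public address would be consistent with what the capture above shows.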
epmd output;
#############################################
root@REDACTED:/usr/local/src# epmd -d -d -d
epmd: Fri Jan 13 14:41:25 2012: epmd running - daemon = 0
epmd: Fri Jan 13 14:41:25 2012: try to initiate listening port 4369
epmd: Fri Jan 13 14:41:25 2012: entering the main select() loop
epmd: Fri Jan 13 14:41:27 2012: Non-local peer connected
epmd: Fri Jan 13 14:41:27 2012: time in seconds: 1326483687
epmd: Fri Jan 13 14:41:27 2012: opening connection on file descriptor 4
epmd: Fri Jan 13 14:41:27 2012: time in seconds: 1326483687
epmd: Fri Jan 13 14:41:27 2012: got 30 bytes
***** 00000000 00 1c 78 b8 4a 4d 00 00 05 00 05 00 0f 73 6f 6d |..x.JM.......som|
***** 00000010 65 73 74 72 61 6e 67 65 6e 61 6d 65 00 00       |estrangename..|
epmd: Fri Jan 13 14:41:27 2012: time in seconds: 1326483687
epmd: Fri Jan 13 14:41:27 2012: ** got ALIVE2_REQ
epmd: Fri Jan 13 14:41:27 2012: ALIVE2_REQ from non local address
epmd: Fri Jan 13 14:41:27 2012: closing connection on file descriptor 4
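For reference, the 30 bytes in the dump above can be decoded by hand. The Python sketch below assumes the ALIVE2_REQ field layout as I understand it from the Erlang distribution protocol documentation (2-byte length, 1-byte tag 120, port, node type, protocol, highest and lowest version, name length, name, extra length); the function and key names are my own.

```python
import struct

# The 30 bytes from the epmd debug dump above.
PACKET = bytes.fromhex(
    "001c78b84a4d0000050005000f"       # length, tag, port, type, proto, versions, name length
    "736f6d65737472616e67656e616d65"   # node name
    "0000"                             # length of the (empty) extra field
)

# Big-endian fixed header that follows the 2-byte length prefix.
HEADER = ">BHBBHHH"


def decode_alive2_req(packet: bytes) -> dict:
    """Decode an ALIVE2_REQ message, including its 2-byte length prefix."""
    (length,) = struct.unpack_from(">H", packet, 0)
    tag, port, node_type, protocol, highest, lowest, name_len = struct.unpack_from(
        HEADER, packet, 2
    )
    name_start = 2 + struct.calcsize(HEADER)
    name = packet[name_start:name_start + name_len].decode("latin-1")
    (extra_len,) = struct.unpack_from(">H", packet, name_start + name_len)
    return {
        "length": length,
        "tag": tag,
        "port": port,
        "node_type": node_type,
        "protocol": protocol,
        "highest_version": highest,
        "lowest_version": lowest,
        "name": name,
        "extra_len": extra_len,
    }


for key, value in decode_alive2_req(PACKET).items():
    print(key, value)
```

The decoded name is "somestrangename" and the tag is 120, which matches the "got ALIVE2_REQ" line in the log, so the request itself looks well formed and the rejection appears to be based purely on the caller's address.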