[erlang-questions] net_adm:names vs. erlang:nodes
Szoboszlay Dániel
dszoboszlay@REDACTED
Wed Nov 27 23:37:20 CET 2013
Hi,
Regarding:
> control3> names().
> {ok, [{'control',34285}]}
> control4> net_adm:nodes().
> ['streamer@REDACTED']
I assume these were meant to be net_adm:names() and nodes(). The former
returns the name and server port of every node registered on the current
host, regardless of whether it is connected to control, while the latter
returns the nodes that are connected to control, regardless of whether they
are on the same host or a remote one.
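To see the difference side by side, something like the following can be run on any node. It is only a sketch: the module name node_view and the function report/0 are made up for illustration, while net_adm:names/0 and nodes/0 are the standard calls.

```erlang
-module(node_view).
-export([report/0]).

%% Returns {RegisteredOnThisHost, Connected}.
%% node_view and report/0 are hypothetical names used only for this sketch.
report() ->
    %% Ask the local epmd which nodes are registered on this host.
    %% net_adm:names/0 returns {ok, [{Name, Port}]} or {error, Reason}.
    Registered = case net_adm:names() of
                     {ok, Names}     -> Names;
                     {error, Reason} -> {error, Reason}
                 end,
    %% Visible nodes this node currently has a distribution connection to,
    %% whether they run on this host or on a remote one.
    Connected = nodes(),
    {Registered, Connected}.
```

On control, the first element would list control itself (it runs on that host) and the second would list streamer once the two are connected.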
Regarding the one-way connectivity issues: I think your problem is caused
by the algorithm Erlang uses to avoid duplicated connections between
nodes. When two nodes attempt to connect to each other simultaneously,
there's a race condition. We don't want two connections between them (e.g.
one initiated by control to streamer and one from streamer to control).
The decision is that the lesser wins (by comparing the node names): the
connection attempt of the greater node is refused by the lesser.
When local connects to both control and streamer, these two nodes will
both attempt to connect to each other automatically to form a fully
connected mesh. Following the rule described above, control will refuse the
incoming connection attempt of streamer, believing that streamer will
accept its own attempt[1]. However, due to your network layout, initiating
a TCP connection in that direction is impossible, so the request will
eventually time out and the two nodes remain disconnected.
When you issue net_adm:ping('control') on streamer you force a new
connection attempt. This one, however, is not mutual, so it will succeed.
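A small helper for doing that ping from the side that can actually open the TCP connection might look like this. It is a sketch under my own naming (dist_ping and connect_from_here/1 are hypothetical); the only library calls are net_adm:ping/1 and nodes/0.

```erlang
-module(dist_ping).
-export([connect_from_here/1]).

%% Try to set up the distribution connection from this node to Node.
%% dist_ping and connect_from_here/1 are hypothetical names for this sketch.
connect_from_here(Node) when is_atom(Node) ->
    case net_adm:ping(Node) of
        %% pong: the connection is up, report the current peer list.
        pong -> {ok, nodes()};
        %% pang: Node could not be reached from here.
        pang -> {error, unreachable}
    end.
```

In your setup this would be called on streamer with the full name of control (the <host>@<ip address> atom).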
I had a very similar situation with some server nodes (let's say s1, s2,
s3) and a debug node called d with a firewall between them. d could
connect to the server hosts, but not the other way around. When I pinged
s1 from d they connected, but d didn't connect to s2 or s3 automatically.
(And by the way, those servers flooded the logs with reports of failed
connection attempts to d.) The workaround was to rename my debug node
to z, so its name is now larger than the servers'.
It's quite a hackish solution, but if one-way firewalls cause problems in
your network you might try creatively renaming your nodes (streamer ->
accelerated_streamer? control -> streamer_control?).
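Before renaming anything you can check how the candidate names compare. Node names are atoms, and atoms are ordered by comparing their text character by character. A sketch (node_order and greater/2 are names I made up, and the host parts used below are placeholders):

```erlang
-module(node_order).
-export([greater/2]).

%% Return whichever of two node names sorts higher in Erlang's term order.
%% node_order and greater/2 are hypothetical names for this sketch.
greater(A, B) when is_atom(A), is_atom(B) ->
    %% Atoms compare by their text, so the part before the @ usually decides.
    erlang:max(A, B).
```

For example, node_order:greater('z@10.0.0.1', 's1@10.0.0.2') gives 'z@10.0.0.1', and node_order:greater('accelerated_streamer@10.0.0.1', 'control@10.0.0.2') gives 'control@10.0.0.2'.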
BR,
Daniel
[1] In fact, I think control will not even try to connect. It is sure it
would lose the race, so why bother?
On Wed, 27 Nov 2013 12:48:53 -0000, Des Small <small@REDACTED> wrote:
> Hi,
>
> I have been having some problems with getting Erlang nodes to see each
> other on a complex network. I start all nodes with long names and raw
> numerical IP addresses, since I also can't rely on DNS to do the right
> thing.
>
> I have three nodes: my local node ("local"); a "control" node; and a
> "streamer" node that sends data over a fast network link.
>
> Now, "control" can't see "streamer" using their specified IP addresses:
>
> control> net_adm:ping('streamer').
>
> returns 'pang', and that's reasonable because traceroute(1) *also* can't
> find a path to the IP address used for 'streamer'. But IP traffic the
> other way works fine and 'streamer' *can* net_adm:ping('control') and
> get 'pong', and after that repeating the net_adm:ping('streamer') from
> 'control' *does* work.
>
> So we have the following interaction:
>
> """
> control1>net_adm:ping('streamer').
> pang
> streamer1>net_adm:ping('control').
> pong
> control2>net_adm:ping('streamer').
> pong
> """
> (All hostname atoms above are really of the form <host>@<ip address>.)
>
> It took quite a while to realise that manually pinging from 'streamer'
> to 'control' would set the network up correctly; both are visible from
> 'local' and I was pinging them from there successfully without getting a
> link from 'control' to 'streamer'.
>
> Continuing, we discover that erlang:names on 'control' *doesn't* have a
> port for 'streamer'.
>
> control3> names().
> {ok, [{'control',34285}]}
>
> but net_adm:nodes *does* see it.
>
> control4> net_adm:nodes().
> ['streamer@REDACTED']
>
> So, I am mostly asking after the fact how I should have known this, and
> how to ask erlang itself what route it is actually using between
> 'control' and 'streamer'? I'd cheerfully settle for a hint as to which
> part of the FM I should be R'ing!
>
> (I am not currently asking whether our network topology is particularly
> optimal, although I am of course open to opinions.)
>
> Cheers,
>
> Des