[erlang-questions] net_adm:names vs. erlang:nodes

Wed Nov 27 23:37:20 CET 2013

Hi,

Regarding:

> control3> names().
> {ok, [{'control',34285}]}

> control4> net_adm:nodes().
> ['streamer@REDACTED']

I assume these were meant to be net_adm:names() and nodes(). The former
returns the name and listening port of every node on the current host,
regardless of whether it is connected to control, while the latter returns
the nodes that are connected to control, regardless of whether they are on
the same host or a remote one.
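
To see the difference side by side, something like the following could be
run on control. It is only a sketch: the module name node_view and the
shape of the returned list are my own choices, not anything from OTP.

```erlang
-module(node_view).
-export([snapshot/0]).

%% Collect, in one list, what is registered on this host and what this
%% node is currently connected to. (node_view is a hypothetical name.)
snapshot() ->
    Registered =
        case net_adm:names() of
            {ok, NamePorts} -> NamePorts;       % [{Name, Port}] on this host
            {error, Reason} -> {error, Reason}  % e.g. EPMD is not reachable
        end,
    [{registered_on_this_host, Registered},
     {connected_visible, nodes()},              % connected, local or remote
     {connected_hidden, nodes(hidden)}].        % hidden connections, if any
```

A remote node such as streamer can only ever show up in the connected
entries, never in the registered one.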

Regarding the one-way connectivity issues: I think your problem is caused  
by the algorithm Erlang uses to avoid duplicated connections between  
nodes. When two nodes attempt to connect to each other simultaneously,  
there's a race condition. We don't want two connections between them (e.g.  
one initiated by control to streamer and one from streamer to control).  
The decision is that the lesser wins (by comparing the node names): the  
connection attempt of the greater node is refused by the lesser.

When local connects to both control and streamer, these two nodes will
both attempt to connect to each other automatically to form a fully
connected mesh. In the scenario described above, control will refuse the
incoming connection attempt of streamer, believing that streamer will
accept its own attempt[1]. However, due to your network layout, initiating
a TCP connection in that direction is impossible, so the request will
eventually time out and the two nodes remain disconnected.

When you issue net_adm:ping('control') on streamer you force a new
connection attempt. This attempt, however, is not mutual, so it succeeds.
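
If you want to do this from code rather than by hand, a small helper run
on the node that can dial out (streamer in your case) might look like the
sketch below. The module name dial_out is made up; net_adm:ping/1 is the
same call you used in the shell.

```erlang
-module(dial_out).
-export([ensure_connected/1]).

%% Ping every node in Targets from this node and report the result.
%% net_adm:ping/1 returns pong if the connection is up afterwards and
%% pang otherwise. (dial_out is a hypothetical name.)
ensure_connected(Targets) ->
    [{Node, net_adm:ping(Node)} || Node <- Targets].
```

On streamer you would call dial_out:ensure_connected(['control']), with
the atom replaced by control's full node name.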

I had a very similar situation with some server nodes (let's say s1, s2,  
s3) and a debug node called d with a firewall between them. d could  
connect to the server hosts, but not the other way around. When I pinged  
s1 from d they connected, but d didn't connect to s2 or s3 automatically.  
(And btw. those servers flooded the logs with reports of failed
connection attempts to d.) The workaround was to rename my debug node
to z, so its name is now larger than the servers'.

It's quite a hackish solution, but if one-way firewalls cause problems in  
your network you might try creatively renaming your nodes (streamer ->  
accelerated_streamer? control -> streamer_control?).
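
Before renaming anything you can check how two candidate names compare.
Node names are atoms, and atoms are ordered by their text, so the ordinary
comparison functions are enough. A minimal sketch; the module name
name_order and the host parts in the example are invented for
illustration.

```erlang
-module(name_order).
-export([lesser/2]).

%% Return whichever of the two node names sorts first in Erlang's term
%% order. Atoms are compared by their text. (name_order is a
%% hypothetical name.)
lesser(NodeA, NodeB) when is_atom(NodeA), is_atom(NodeB) ->
    erlang:min(NodeA, NodeB).
```

For example, name_order:lesser('z@10.0.0.9', 's1@10.0.0.1') returns
's1@10.0.0.1'.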

BR,
Daniel

[1] In fact, I think control will not even try to connect. It's sure it
would lose the race, so why bother?

On Wed, 27 Nov 2013 12:48:53 -0000, Des Small <small@REDACTED> wrote:

> Hi,
>
> I have been having some problems with getting Erlang nodes to see each  
> other on a complex network.  I start all nodes with long names and raw
> numerical IP addresses, since I also can't rely on DNS to do the right  
> thing.
>
> I have three nodes: my local node ("local"); a "control" node; and a  
> "streamer" node that sends data over a fast network link.
>
> Now, "control" can't see "streamer" using their specified IP addresses:
>
> control> net_adm:ping('streamer').
>
> returns 'pang', and that's reasonable because traceroute(1) *also* can't  
> find a path to the IP address used for 'streamer'.  But IP traffic the  
> other way works fine and 'streamer' *can* net_adm:ping('control') and  
> get 'pong', and after that repeating the net_adm:ping('streamer') from  
> 'control' *does* work.
>
> So we have the following interaction:
>
> """
> control1>net_adm:ping('streamer').
> pang
> streamer1>net_adm:ping('control').
> pong
> control2>net_adm:ping('streamer').
> pong
> """
> (All hostname atoms above are really of the form <host>@<ip address>.)
>
> It took quite a while to realise that manually pinging from 'streamer'  
> to 'control' would set the network up correctly; both are visible from  
> 'local' and I was pinging them from there successfully without getting a  
> link from 'control' to 'streamer'.
>
> Continuing, we discover that erlang:names on 'control' *doesn't* have a  
> port for 'streamer'.
>
> control3> names().
> {ok, [{'control',34285}]}
>
> but net_adm:nodes *does* see it.
>
> control4> net_adm:nodes().
> ['streamer@REDACTED']
>
> So, I am mostly asking after the fact how I should have known this, and  
> how to ask erlang itself what route it is actually using between  
> 'control' and 'streamer'?  I'd cheerfully settle for a hint as to which
> part of the FM I should be R'ing!
>
> (I am not currently asking whether our network topology is particularly  
> optimal, although I am of course open to opinions.)
>
> Cheers,
>
> Des