[erlang-questions] Trouble with ct_slave node initialization

Sargun Dhillon sargun@REDACTED
Sun Feb 7 07:06:31 CET 2016


So, I think the problem stems from ct_slave's use of
inet:gethostname():
https://github.com/erlang/otp/blob/maint/lib/common_test/src/ct_slave.erl#L382-L391.
Even when dispatched for long names, it's effectively using
inet:gethostname() --
https://github.com/erlang/otp/blob/maint/lib/kernel/src/net_adm.erl#L75-L80.
That in turn calls gethostname(2)
(http://linux.die.net/man/2/gethostname), which just returns what it
gets from uname.

So, in theory, shortname should work as long as we start up our node
like chris@$(uname -n). Then, when ct_slave tries to connect to the
slave, it'll connect to it as chris4@$(uname -n), and it'll start the
node locally without triggering SSH.

Of course, that doesn't work, because when ct_slave starts the node,
the command it generates to start up erlang looks closer to this:
-home /home/ubuntu -- -noshell -noinput -noshell -noinput
-setcookie SCPOZKCLKFHQSBVEAKEF -sname chris4. ct_slave "calculates"
that the name of the slave is going to be chris4@$(uname -n) (based
upon the result of gethostname). When the slave starts, it calculates
its own hostname using inet_db:gethostname:
https://github.com/erlang/otp/blob/maint/lib/kernel/src/net_kernel.erl#L1242-L1269.
This gets it from inet_config:
https://github.com/erlang/otp/blob/maint/lib/kernel/src/inet_config.erl#L47-L86,
which initially sets it via inet:gethostname() and then does an
inet:gethostbyname on that name. Given this divergence, we get two
different node names.
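
To see whether a given box is affected, something like the following
could be run on it. This is only a rough sketch that approximates the
lookup order described above; it is not a copy of what inet_config
does. The module name host_diverge and its functions are made up for
illustration.

```erlang
-module(host_diverge).
-export([names/0, diverges/2, short/1]).
-include_lib("kernel/include/inet.hrl").

%% Keep only the part before the first dot, since -sname nodes
%% use the host name without a domain.
short(Host) when is_atom(Host) -> short(atom_to_list(Host));
short(Host) when is_list(Host) ->
    lists:takewhile(fun(C) -> C =/= $. end, Host).

%% True when the two host names still differ after shortening.
diverges(A, B) -> short(A) =/= short(B).

%% Return the raw name from inet:gethostname/0, the primary name the
%% resolver reports for it, and whether the two diverge.
names() ->
    {ok, Raw} = inet:gethostname(),
    Canonical = case inet:gethostbyname(Raw) of
                    {ok, #hostent{h_name = Name}} -> Name;
                    {error, _Reason} -> Raw  % name did not resolve
                end,
    {Raw, Canonical, diverges(Raw, Canonical)}.
```

If host_diverge:names() returns true in the third element, the name
ct_slave expects and the name the slave picks for itself are likely
to differ.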

The only obvious way I see to alleviate this, other than configuring
the utsname correctly, is to ensure there is an /etc/hosts entry for
127.0.0.1 that lists the uname first, before localhost, so it's the
primary hostname and not an alias in the gethostbyname call.
Unfortunately, on many systems this file is controlled by an outside
entity, so that's a no-go.
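
For illustration, assuming uname -n prints box545 (the hostname from
Christopher's message below), the entry would look roughly like this,
with the uname ahead of localhost:

```
127.0.0.1   box545 localhost
```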

I wish that common test would split node(), get the host part, and
use that to start the slave with -sname chris4@$HOST. Opinions?
Could we at least add an option for it?
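
A minimal sketch of what I mean by splitting node(). The module
node_host and its functions are hypothetical, not anything that
exists in common test today, and this only builds the argument
string; it does not start anything.

```erlang
-module(node_host).
-export([host_part/1, sname_arg/2]).

%% Return the host part of a node name such as 'chris2@box545'.
host_part(Node) when is_atom(Node) ->
    case string:tokens(atom_to_list(Node), "@") of
        [_Name, Host] -> {ok, Host};
        _Other -> {error, {bad_node_name, Node}}
    end.

%% Build the -sname argument for a slave that should use the same
%% host part as Master, instead of leaving the slave to work it out.
sname_arg(SlaveName, Master) when is_atom(SlaveName) ->
    {ok, Host} = host_part(Master),
    "-sname " ++ atom_to_list(SlaveName) ++ "@" ++ Host.
```

Called as node_host:sname_arg(chris4, node()) on the master, this
would give the slave the same host part the master is already using.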

On Fri, Jan 29, 2016 at 5:01 PM, Christopher Meiklejohn
<christopher.meiklejohn@REDACTED> wrote:
> I'm seeing the following behavior where ct_slave fails to initialize
> the node within the boot_timeout, however, the node is being
> registered via epmd and is pingable after initialization.
>
> (chris2@REDACTED)9> ct_slave:start('chris3@REDACTED').
> {error,boot_timeout,chris3@REDACTED@box545}
> (chris2@REDACTED)10> net_adm:ping('chris3@REDACTED').
> pong
>
> (chris2@REDACTED)11> net_adm:ping('chris4').
> pang
> (chris2@REDACTED)12> ct_slave:start('chris4').
> {error,boot_timeout,chris4@REDACTED}
> (chris2@REDACTED)13> net_adm:ping('chris4').
> pang
> (chris2@REDACTED)14>
>
> (chris2@REDACTED)16> net_adm:ping('chris4@REDACTED').
> pong
>
> The box's hostname is box545; however, Erlang is running with
> distribution on, but with only shortnames enabled.
>
> Thanks,
> Christopher


