11 Distributed Erlang

11.1 Distributed Erlang System

A distributed Erlang system consists of a number of Erlang runtime systems communicating with each other. Each such runtime system is called a node. Message passing between processes at different nodes, as well as links and monitors, is transparent when pids are used. Registered names, however, are local to each node. This means that the node must also be specified when, for example, sending messages using registered names.
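
As a minimal sketch of addressing a registered process together with its node, the following module sends to the tuple {Name, Node}. The module name dist_send_demo, the function names, and the registered name dist_send_demo_proc are hypothetical and chosen only for illustration. The demo targets node() so that it can be tried on a single runtime system.

```erlang
-module(dist_send_demo).
-export([send_to_registered/3, demo/0]).

%% Send Msg to the process registered as Name on Node.
%% A bare `Name ! Msg` only looks up Name on the local node.
send_to_registered(Name, Node, Msg) ->
    {Name, Node} ! Msg,
    ok.

%% Local demonstration (hypothetical names): register the calling
%% process, then address it through the {Name, Node} form with
%% node() as the target node.
demo() ->
    true = register(dist_send_demo_proc, self()),
    ok = send_to_registered(dist_send_demo_proc, node(), hello),
    Result = receive
                 hello -> received
             after 1000 ->
                 timeout
             end,
    true = unregister(dist_send_demo_proc),
    Result.
```

With a real remote node, the same call would take that node's name instead of node().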

The distribution mechanism is implemented using TCP/IP sockets. How to implement an alternative carrier is described in the ERTS User's Guide.

11.2 Nodes

A node is an executing Erlang runtime system which has been given a name, using the command line flag -name (long names) or -sname (short names).

The format of the node name is an atom name@host where name is the name given by the user and host is the full host name if long names are used, or the first part of the host name if short names are used. node() returns the name of the node. Example:

% erl -name dilbert
(dilbert@uab.ericsson.se)1> node().
'dilbert@uab.ericsson.se'

% erl -sname dilbert
(dilbert@uab)1> node().
dilbert@uab

Note!
A node with a long node name cannot communicate with a node with a short node name.
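
The name@host structure described above can be taken apart with ordinary string functions. This is only an illustrative sketch; the module name node_name_demo and the function split_node/1 are hypothetical helpers, not part of OTP.

```erlang
-module(node_name_demo).
-export([split_node/1]).

%% Split a node name atom such as dilbert@uab into {Name, Host} strings.
%% Atoms that do not contain "@" are reported as an error tuple.
split_node(Node) when is_atom(Node) ->
    case string:split(atom_to_list(Node), "@") of
        [Name, Host] -> {Name, Host};
        _ -> {error, not_a_node_name}
    end.
```

Calling node_name_demo:split_node(node()) on a running node gives the two parts of that node's own name.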

11.3 Node Connections

The nodes in a distributed Erlang system are loosely connected. The first time the name of another node is used, for example if spawn(Node,M,F,A) or net_adm:ping(Node) is called, a connection attempt to that node will be made.
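
A connection attempt can be triggered explicitly with net_adm:ping/1, which returns pong or pang. The wrapper below is a sketch; the module name connect_demo, the function try_connect/1, and the returned tuples are hypothetical.

```erlang
-module(connect_demo).
-export([try_connect/1]).

%% Attempt to set up a connection to Node.
%% net_adm:ping/1 returns pong on success and pang on failure.
try_connect(Node) when is_atom(Node) ->
    case net_adm:ping(Node) of
        pong -> {ok, connected};
        pang -> {error, unreachable}
    end.
```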

Connections are by default transitive. If a node A connects to node B, and node B has a connection to node C, then node A will also try to connect to node C. This feature can be turned off by using the command line flag -connect_all false, see erl(1).

If a node goes down, all connections to that node are removed. Calling erlang:disconnect_node(Node) forces the disconnection of a node.

The list of (visible) nodes currently connected to is returned by nodes().
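
The following sketch forces a disconnection and then reports the remaining visible nodes. The module name disconnect_demo, the function drop/1, and the result tags are hypothetical; the three return values of erlang:disconnect_node/1 are the documented ones.

```erlang
-module(disconnect_demo).
-export([drop/1]).

%% Force a disconnection from Node and report what happened.
%% erlang:disconnect_node/1 returns true if the disconnection
%% succeeded, false otherwise, and ignored if the local runtime
%% system is not alive as a node.
drop(Node) when is_atom(Node) ->
    case erlang:disconnect_node(Node) of
        true    -> {disconnected, nodes()};
        false   -> {not_connected, nodes()};
        ignored -> {local_node_not_alive, []}
    end.
```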

11.4 epmd

The Erlang Port Mapper Daemon epmd is automatically started at every host where an Erlang node is started. It is responsible for mapping the symbolic node names to machine addresses. See epmd(1).
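
The names currently registered with the epmd on the local host can be queried with net_adm:names/0. The sketch below keeps only the node names; the module name epmd_demo and the function local_names/0 are hypothetical.

```erlang
-module(epmd_demo).
-export([local_names/0]).

%% Ask the epmd on the local host which node names it has registered.
%% net_adm:names/0 returns {ok, [{Name, Port}]} or {error, Reason},
%% for example when no epmd is running on the host.
local_names() ->
    case net_adm:names() of
        {ok, Registered} -> {ok, [Name || {Name, _Port} <- Registered]};
        {error, Reason}  -> {error, Reason}
    end.
```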

11.5 Hidden Nodes

In a distributed Erlang system, it is sometimes useful to connect to a node without also connecting to all other nodes. An example could be some kind of O&M functionality used to inspect the status of a system without disturbing it. For this purpose, a hidden node may be used.

A hidden node is a node started with the command line flag -hidden. Connections between hidden nodes and other nodes are not transitive; they must be set up explicitly. Also, hidden nodes do not show up in the list of nodes returned by nodes(). Instead, nodes(hidden) or nodes(connected) must be used. This means, for example, that the hidden node will not be added to the set of nodes that global is keeping track of.

This feature was added in Erlang 5.0/OTP R7.
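
As a small sketch of how the different arguments to nodes/1 relate, the function below collects the visible, hidden, and connected node lists in one map. The module name hidden_demo and the function connected_by_kind/0 are hypothetical.

```erlang
-module(hidden_demo).
-export([connected_by_kind/0]).

%% Return the nodes this node is connected to, grouped by kind.
%% nodes(visible) is the same list as nodes(); nodes(connected)
%% covers both visible and hidden nodes.
connected_by_kind() ->
    #{visible   => nodes(visible),
      hidden    => nodes(hidden),
      connected => nodes(connected)}.
```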

11.6 C Nodes

A C node is a C program written to act as a hidden node in a distributed Erlang system. The library Erl_Interface contains functions for this purpose. Refer to the documentation for Erl_Interface and the Interoperability Tutorial for more information about C nodes.

11.7 Security

Nodes are protected by a magic cookie system. When a connection attempt is made from a node A to a node B, node A provides the cookie that has been set for node B. If the magic cookie is not correct, the connection attempt is rejected.

If the node is started with the command line flag -setcookie Cookie, then the cookie will be assumed to be the atom Cookie for all other nodes.

If the node is started without the command line flag, then the node will create the cookie from the contents of the file $HOME/.erlang.cookie, where $HOME is the user's home directory. If the file does not exist, it will be created and contain a random character sequence.

Which magic cookie to use when connecting to another node can be changed by calling erlang:set_cookie/2.
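
A sketch of setting a per-node cookie before connecting is shown below. The module name cookie_demo, the function connect_with_cookie/2, and the error tuples are hypothetical. The is_alive() check is an assumption made here so that the function also returns cleanly on a runtime system that was not started as a node.

```erlang
-module(cookie_demo).
-export([connect_with_cookie/2]).

%% Set the cookie to use towards Node, then try to connect to it.
%% A failed ping does not tell an unreachable node apart from a
%% rejected cookie, so both cases share one error tuple.
connect_with_cookie(Node, Cookie) when is_atom(Node), is_atom(Cookie) ->
    case is_alive() of
        false ->
            {error, not_distributed};
        true ->
            true = erlang:set_cookie(Node, Cookie),
            case net_adm:ping(Node) of
                pong -> ok;
                pang -> {error, unreachable_or_wrong_cookie}
            end
    end.
```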

11.8 Distribution BIFs

Some useful BIFs for distributed programming (see erlang(3) for more information):

*Distribution BIFs.*
`erlang:disconnect_node(Node)`	Forces the disconnection of a node.
`erlang:get_cookie()`	Returns the magic cookie of the current node.
`is_alive()`	Returns `true` if the runtime system is a node and can connect to other nodes, `false` otherwise.
`monitor_node(Node, true|false)`	Monitors the status of `Node`. A message `{nodedown, Node}` is received if the connection to it is lost.
`node()`	Returns the name of the current node. Allowed in guards.
`node(Arg)`	Returns the node where `Arg`, a pid, reference, or port, is located.
`nodes()`	Returns a list of all visible nodes this node is connected to.
`nodes(Arg)`	Depending on `Arg`, this function can return a list not only of visible nodes, but also hidden nodes and previously known nodes, etc.
`set_cookie(Node, Cookie)`	Sets the magic cookie used when connecting to `Node`. If `Node` is the current node, `Cookie` will be used when connecting to all new nodes.
`spawn[_link|_opt](Node, Fun)`	Creates a process at a remote node.
`spawn[_link|_opt](Node, Module, FunctionName, Args)`	Creates a process at a remote node.
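
The sketch below combines spawn(Node, Fun) with node/0 and node/1 from the table: it starts a process on a given node and compares the node that the process reports with the node encoded in its pid. The module name remote_spawn_demo and the function where_am_i/1 are hypothetical. Note that spawning a fun on another node assumes that the module defining the fun is also available there.

```erlang
-module(remote_spawn_demo).
-export([where_am_i/1]).

%% Spawn a process on Node that reports the node it runs on.
%% Returns {ok, ReportedNode, NodeOfPid} or {error, timeout}.
where_am_i(Node) when is_atom(Node) ->
    Parent = self(),
    Ref = make_ref(),
    Pid = spawn(Node, fun() -> Parent ! {Ref, node()} end),
    receive
        {Ref, Reported} -> {ok, Reported, node(Pid)}
    after 5000 ->
        {error, timeout}
    end.
```

Passing node() as the argument runs the whole exchange on the local runtime system, which is a convenient way to try the function out.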

11.9 Distribution Command Line Flags

Examples of command line flags used for distributed programming (see erl(1) for more information):

*Distribution Command Line Flags.*
`-connect_all false`	Only explicit connection set-ups will be used.
`-hidden`	Makes a node into a hidden node.
`-name Name`	Makes a runtime system into a node, using long node names.
`-setcookie Cookie`	Same as calling `erlang:set_cookie(node(), Cookie)`.
`-sname Name`	Makes a runtime system into a node, using short node names.
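
From inside a running system, init:get_argument/1 can be used to check which of these flags were given on the command line. The following is a sketch only; the module name dist_flags_demo and the function started_with/0 are hypothetical.

```erlang
-module(dist_flags_demo).
-export([started_with/0]).

%% Report which of the distribution flags from the table were given
%% on the command line. init:get_argument/1 returns {ok, Values} for
%% a flag that is present and error otherwise; the generator pattern
%% silently skips the error case.
started_with() ->
    Flags = [connect_all, hidden, name, setcookie, sname],
    [{Flag, Values} || Flag <- Flags,
                       {ok, Values} <- [init:get_argument(Flag)]].
```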

11.10 Distribution Modules

Examples of modules useful for distributed programming:

In Kernel:

*Kernel Modules Useful For Distribution.*
`global`	A global name registration facility.
`global_group`	Grouping nodes to global name registration groups.
`net_adm`	Various Erlang net administration routines.
`net_kernel`	Erlang networking kernel.
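
As an illustration of the global module, the sketch below registers the calling process under a globally visible name, looks the name up, and sends a message through it. The module name global_demo, the function demo/0, and the name term are hypothetical. The demo runs within one runtime system, although the same calls apply across connected nodes.

```erlang
-module(global_demo).
-export([demo/0]).

%% Register the calling process under a unique global name, resolve
%% the name again, send a message via the name, and clean up.
%% Returns {FoundSelf, Outcome}.
demo() ->
    Name = {global_demo, make_ref()},
    yes = global:register_name(Name, self()),
    Found = global:whereis_name(Name),
    _ = global:send(Name, ping),
    Got = receive
              ping -> received
          after 1000 ->
              timeout
          end,
    _ = global:unregister_name(Name),
    {Found =:= self(), Got}.
```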

In STDLIB:

*STDLIB Modules Useful For Distribution.*
`slave`	Start and control of slave nodes.