
4 Corrective Maintenance

Some problems that can occur are related to the installation of the system rather than to faulty Erlang code.

4.1 Trouble-shooting

If the Erlang runtime system fails to start properly, the most likely cause is either that Erlang was installed incorrectly or that the advanced Erlang shell cannot handle the terminal type in use. The latter case is easy to identify, because Erlang can then still be started with erl -oldshell (although the advanced editing features will not work). If this occurs, please submit a problem report, and be sure to include the value of the $TERM environment variable.

A typical example of an incorrect installation is where the top level directory is incorrectly specified to the Install script. Check the path name specified for Rootdir in the installed erl script. If this is incorrect, rerun the Install script with the correct path name.
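The check described above can be sketched as a small shell function. This is an illustration only: it assumes that the installed erl script is a shell script that assigns the root directory on a line containing ROOTDIR= (the exact spelling may differ between releases, so the match ignores case), and show_rootdir is a hypothetical helper name.

```shell
# show_rootdir SCRIPT
# Print every line of the given erl start script that assigns the root
# directory. Assumption: the assignment contains "rootdir=" in any case.
show_rootdir() {
    grep -i 'rootdir=' "$1"
}

# Example (the path is an assumption; use the erl script of your installation):
#   show_rootdir /usr/local/bin/erl
```

If the directory printed is not the top-level directory of the installation, rerun the Install script as described above.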

Users running Sun OpenWindows should note that the advanced shell does not work correctly in cmdtool or shelltool if the scrollbar is activated. This is due to deficiencies in these programs. Either deactivate the scrollbar, use erl -oldshell, or use another terminal program, for example xterm.

Using distributed Erlang is simpler if the environment is set up to use DNS for host name and address lookups, but this is not required. Further details about trouble-shooting for this case can be found below.

4.2 Trouble-shooting Distributed Erlang

A number of things can go wrong when starting distributed Erlang. If you do not understand the following, ask your system administrator for help.

Distributed Erlang can be started either with the -name or the -sname flag.

To use distributed Erlang in a Wide Area Network environment, it is necessary to use the -name flag when starting nodes. If this is done, Erlang will look up the IP addresses of hosts using the mechanism that was specified at installation.

If in subsequent operations Erlang is supplied with a node name, e.g. foobar@super.eua.ericsson.se, Erlang will contact epmd (Erlang Port Mapper Daemon) at the host super.eua.ericsson.se in order to find the address of the node called foobar there.
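To illustrate how a full node name divides into the two parts mentioned above, the following shell sketch splits a node name at the @ character. The function names node_name and node_host are hypothetical, and the sketch is only meant to show which part names the node and which part names the host where epmd is contacted.

```shell
# node_name NODE: print the part before the first "@" (the name of the node).
node_name() {
    printf '%s\n' "${1%%@*}"
}

# node_host NODE: print the part after the first "@" (the host on which
# epmd is contacted).
node_host() {
    printf '%s\n' "${1#*@}"
}

# Example:
#   node_host foobar@super.eua.ericsson.se
```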

When Erlang is installed, the administrator installing the system specifies whether only DNS is to be used for address lookup, or whether the mechanism provided by the underlying operating system is to be used (the latter may very well use DNS only, or a combination of DNS and lookups in the /etc/hosts file).

If DNS is used, all hosts involved must be properly configured for DNS; see the UNIX manual page named(8). The following is a list of some of the things that can go wrong on a UNIX system where DNS is supposed to work:

  1. Nonexistent or erroneous /etc/resolv.conf file. If this file does not exist on your system, contact your system administrator, or run Erlang on a computer that has this file.

  2. Host not registered with the DNS name server. To check whether your host is known to a name server, use the program /usr/etc/nslookup (SunOS 4) or /usr/sbin/nslookup (Solaris 2):

      % nslookup
      Default Server: super.eua.ericsson.se
      Address: 134.138.199.16
      > gin
      Server: super.eua.ericsson.se
      Address: 134.138.199.16
      Name: gin.eua.ericsson.se
      Address: 134.138.199.53
      > xi-term1
      Server: super.eua.ericsson.se
      Address: 134.138.199.16
      *** super.eua.ericsson.se can't find xi-term1:
      Non-existent domain
      > exit
      %
    The above is an example session with nslookup. First, the host gin is looked up, which succeeds. Then the host xi-term1 is looked up. This host is known to NIS at the site, but named does not recognize it. For this reason, distributed Erlang cannot run at all on the host xi-term1.

  3. The port number of epmd, see epmd(3), is already used by another program. In Erlang R2A (Erlang 4.5), port 4368 is used for epmd. If this port is used by another program, distributed Erlang cannot run. The following checks whether port 4368 is already in use (use /usr/ucb/netstat on SunOS 4, and /usr/bin/netstat on Solaris 2):

      % epmd -kill
      Killed
      % sleep 120
      % /usr/ucb/netstat -a | grep 4368
      %
      
    The code above does the following:
    1. Kills epmd on the host. The epmd executable is located under the erlang/bin directory.

    2. Waits for a period to let TCP connections disappear.

    3. Checks whether any other program is using the port. If the netstat command produces any output, another program is occupying the epmd port. If possible, remove that program and try again.

    The file /etc/services must also be checked:

      % cat /etc/services | grep 4368
      % ypcat services | grep 4368
      
    If either of the above commands produces any output, there is a problem, since it is not possible to choose the epmd port number on the command line. The only solution is to move the obstructing program to another TCP port number.
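Two of the file checks in the list above can be sketched as shell functions. This is a minimal sketch, not part of the Erlang distribution: list_nameservers and port_in_services are hypothetical helper names, and the file to inspect is passed as an argument so that the functions can also be tried on a copy. Note that port_in_services is narrower than the plain grep shown earlier: it skips commented-out lines and matches only entries of the form PORT/tcp or PORT/udp.

```shell
# list_nameservers FILE
# Print the address of every "nameserver" entry in FILE (normally
# /etc/resolv.conf). Fails with a message if the file cannot be read.
list_nameservers() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# port_in_services PORT FILE
# Print the entries of FILE (normally /etc/services) that register PORT,
# ignoring lines that are commented out. The exit status is non-zero if
# no entry is found.
port_in_services() {
    grep -v '^[[:space:]]*#' "$2" | grep -E "[[:space:]]$1/(tcp|udp)"
}

# Examples:
#   list_nameservers /etc/resolv.conf
#   port_in_services 4368 /etc/services
```

Any output from port_in_services for the epmd port names the service that has to be moved to another port.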

If DNS is not used at your site at all, Erlang should not be installed with the option of using DNS only.

The epmd program can be used for checking a host in a similar way to nslookup:

  % epmd -hinfo netsim-server.tei.ericsson.se
  official host name: netsim-server.tei.ericsson.se
  addr type = 2, addr length = 4
  Internet address: 141.137.93.20
  Cant't get hostbyaddr() on host 20.93.137.141.in-addr.arpa
  Bad IPaddr == 141.137.93.20
  % epmd -hinfo super
  official host name: super.eua.ericsson.se
  addr type = 2, addr length = 4
  Internet address: 134.138.199.16
  %

The above is a transcript from a session with epmd. The first host, netsim-server.tei.ericsson.se, has a problem: its address could not be resolved back to a host name.

If distributed Erlang is run with the -sname flag, it can be run in an environment where DNS is not running at all. If the /etc/resolv.conf file is not present, the resolver library routines gethostbyname() and gethostbyaddr() will resort to reading the /etc/hosts file, which must then contain the IP addresses and names of all hosts that are to be used for distributed Erlang applications. The /etc/hosts file can also contain the names of non-local hosts (over a WAN), but if name lookup is used over a WAN, there may be problems if the initial part of a host name is the same in two different domains. If Erlang is run over a WAN, DNS is thus the recommended method.
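When /etc/hosts is the source of host information, it can be useful to confirm that each host intended for distributed Erlang is actually listed there. The following is a minimal sketch under the assumption of the usual hosts file layout (an address followed by one or more names, with # starting a comment); host_in_hosts is a hypothetical helper name, and the comparison is an exact, case-sensitive match.

```shell
# host_in_hosts NAME FILE
# Print the entry of FILE (normally /etc/hosts) in which NAME appears as a
# host name or alias, that is, in any field after the address. The exit
# status is non-zero if NAME is not listed.
host_in_hosts() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                      # strip comments
        { for (i = 2; i <= NF; i++)
              if ($i == name) { print; found = 1; next } }
        END { exit !found }
    ' "$2"
}

# Example:
#   host_in_hosts gin /etc/hosts
```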

One configuration where the -sname flag is the right choice, since DNS is not needed, is a set of target nodes that run disconnected from any network, for example with a laptop computer attached to the targets (possibly via SLIP). In that case it is not reasonable to require DNS.

Another example is a set of nodes on a (non-networked) LAN with a small set of hosts, where DNS is not desirable or possible. Then the option of reading /etc/hosts might be appropriate.


Copyright © 1991-2000 Ericsson Utvecklings AB