Reliability of communication

Ulf Wiger etxuwig@REDACTED
Wed Feb 5 13:14:20 CET 2003


It should perhaps be mentioned that distributed erlang is
designed in such a way that you could implement your own
carrier, e.g. SSL, SCTP, SCI, SAAL, MTP/VIA, or other.

See the chapter on "How to implement an alternative carrier
for the erlang distribution" in the OTP documentation:

http://www.erlang.org/doc/r9b/erts-5.2/doc/html/alt_dist.html#3

The document contains an example that implements distributed
erlang over UNIX Domain Sockets.

Of course, to attempt this, you really should know what
you're doing. (:

Regarding error-detecting strategies, there is a periodic
tick which attempts to detect hanging nodes and
communication failures (one can set the tick time, in
seconds, with the '-kernel net_ticktime' command line
option). It's not advisable to set net_ticktime to less
than 10 on a normal UNIX system, in my opinion.
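
As a rough illustration of how the tick shows up at the application
level, here is a minimal sketch. The module name tick_watch and both
function names are made up for this example. The detection-window
formula is the one given in the kernel application documentation of
later OTP releases (a dead node is noticed somewhere between
TickTime - TickTime/4 and TickTime + TickTime/4 seconds); I am
assuming it describes the release you run, so check your own docs.

```erlang
-module(tick_watch).
-export([start/0, detection_window/1]).

%% Approximate bounds, in seconds, on how long it takes to notice an
%% unresponsive node for a given net_ticktime (assumed formula, see
%% the note above). detection_window(60) gives {45.0, 75.0}.
detection_window(TickTime) ->
    {TickTime - TickTime / 4, TickTime + TickTime / 4}.

%% Spawn a process that subscribes to node status messages and
%% prints every nodeup/nodedown event it receives.
start() ->
    spawn(fun() ->
                  case net_kernel:monitor_nodes(true) of
                      ok    -> loop();
                      Error -> exit(Error)
                  end
          end).

loop() ->
    receive
        {nodeup, Node} ->
            io:format("node up: ~p~n", [Node]),
            loop();
        {nodedown, Node} ->
            %% Sent when the connection is lost, including the case
            %% where the tick mechanism gives up on the node.
            io:format("node down: ~p~n", [Node]),
            loop()
    end.
```

Started on a node launched with something like
'erl -sname a -kernel net_ticktime 60', the process would report
nodedown for a peer whose cable was pulled only once the tick
mechanism gives up on it, not at the moment the link fails.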

/Uffe

On Tue, 4 Feb 2003, Per Bergqvist wrote:

>The weakest part in distributed erlang is IMHO that it
>relies on TCP and the implementation of TCP in most (all
>major) operating systems.
>
>No problem when communicating within a single host but for
>inter-host comms you will see severe problems as soon as
>you start to pull cables.
>
>In order to build a real high availability system you need
>to make sure you have a high availability TCP solution
>either via fault tolerant hardware switches or software
>solutions supporting high availability on simple redundant
>switching hardware (i.e. what I presented at EUC 2001). The
>latter is at least as good as the hardware solutions around
>and is one or two orders of magnitude cheaper.
>
>The obviously best solution would be to use SCTP for
>inter-host communication. As soon as the lksctp guys get
>stable I plan to dig into it.
>
>/Per
>
>> A question related to the development of safety-critical systems:
>>
>> Is inter-process and inter-node communication in Erlang
>> only as reliable as TCP/IP, or are any additional
>> error-detecting strategies used (e.g. hamming, cyclic or
>> polynomial codes)?
>>
>> Dominic.
>>
>=========================================================
>Per Bergqvist
>Synapse Systems AB
>Phone: +46 709 686 685
>Email: per@REDACTED
>

-- 
Ulf Wiger, Senior Specialist,
   / / /   Architecture & Design of Carrier-Class Software
  / / /    Strategic Product & System Management
 / / /     Ericsson Telecom AB, ATM Multiservice Networks



