[erlang-questions] models for replicated/distributed processors

Edmond Begumisa ebegumisa@REDACTED
Thu Mar 8 08:52:15 CET 2012

I spent some time trying to gather wisdom on this subject too...

With regard to fault-tolerant design with Erlang, I found Joe's paper on  
the Bluetail Mail Robustifier to be worth a million dollars (probably my  
favorite Erlang paper, I wish I had read it first before any Erlang books  
or tutorials):

    "Increasing the reliability of email services" by Joe Armstrong.

The following thread on processes and fault tolerance got me some great  


    ... especially Joe's description of take-over/error-recovery
    (the "replicated, synchronized processes" you refer to)...

    ... and also Ulf's description of hot/cold standby and choosing the
    right recovery state:

Hope those help.

- Edmond -

On Tue, 06 Mar 2012 02:36:35 +1100, Miles Fidelman  
<mfidelman@REDACTED> wrote:

> Hi Folks,
> I'm trying to think through various approaches to fault-tolerance for an  
> actor-based system architecture - generally around the notion of  
> replicated copies of actors operating on different nodes.
> Two questions to the assembled wisdom:
> 1. Has anybody done any work with replicated, synchronized processes  
> spread across multiple erlang nodes?  If so, can you share anything  
> about architectural concepts?  (Pointers to papers or slide decks would  
> be much appreciated).
> 2. A more specific question:  I notice that spawn_link has a form that  
> allows creating a linked process on a different node - spawn_link(Node,  
> Module, Function, Args) -> pid() - which seems like a good start on  
> building supervision trees across nodes.  But... there doesn't seem to  
> be an equivalent form of spawn_monitor - seems like spawn_opt/5 can be  
> used instead, but sort of curious about the omission.  Anybody know the  
> story?
> Thanks,
> Miles Fidelman
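
On the spawn_monitor question, one approach that should work regardless of OTP release is to spawn on the remote node and then set up the monitor with erlang:monitor/2. The sketch below is only an illustration: the module name and both function names are made up, and it is not the only way to do this. The spawn and the monitor are two separate steps, so the process can die in between; in that case the 'DOWN' message still arrives, with reason noproc (or noconnection if the node is unreachable) instead of the real exit reason.

```erlang
%% Hypothetical module and function names, for illustration only.
-module(remote_monitor_sketch).
-export([start/4, await_down/2]).

%% Spawn Module:Function(Args) on Node and monitor it from the caller.
%% Returns {Pid, MonitorRef}. Unlike a link, the monitor is one-way:
%% the caller gets a 'DOWN' message but is not killed itself.
start(Node, Module, Function, Args) ->
    Pid = spawn(Node, Module, Function, Args),
    Ref = erlang:monitor(process, Pid),
    {Pid, Ref}.

%% Wait up to Timeout milliseconds for the 'DOWN' message that
%% belongs to Ref.
await_down(Ref, Timeout) ->
    receive
        {'DOWN', Ref, process, _Pid, Reason} ->
            {down, Reason}
    after Timeout ->
        timeout
    end.
```

A caller would do something like {Pid, Ref} = remote_monitor_sketch:start(Node, M, F, A) and then handle the 'DOWN' message in its own receive loop, or use await_down/2 as above. Newer OTP releases (23 and later, as far as I know) also provide erlang:spawn_monitor/4 taking a node as the first argument, which avoids the gap between spawn and monitor.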

