distributed application - replication of state

Thu Aug 25 13:59:51 CEST 2005

Hello,

I have been experimenting with the 'dist' distributed application (from 
Ulf's "OTP Release Handling Tutorial") to implement controlled handling 
of network partitioning using a UDP heartbeat, and ran into the 
following question.

When one of the two distributed nodes crashes, or the network 
partitions, I would like the surviving node to take over the state of 
the other node.  In the current implementation, however, the state is 
handed over smoothly only when the application is running on the 
secondary node and is then started on the primary node.
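
To make the two cases concrete, here is a minimal sketch of an 
application callback module that picks a state source from the start 
type OTP passes to start/2.  The module name dist_app, the function 
state_source/1 and the atoms it returns are hypothetical names chosen 
for illustration; they are not taken from the tutorial.  The start 
types themselves (normal, {takeover, Node}, {failover, Node}) are the 
ones documented for the application behaviour.

```erlang
-module(dist_app).
-behaviour(application).
-behaviour(supervisor).

-export([start/2, stop/1, state_source/1]).
-export([init/1]).

%% Decide where the initial state should come from (hypothetical helper).
%% takeover: the old instance on Node is still alive and can hand over.
%% failover: Node is gone, so only a previously made replica can help.
%% Note: OTP passes {failover, Node} only when the .app file defines
%% start_phases; otherwise a failover start arrives as 'normal'.
state_source(normal)           -> fresh;
state_source({takeover, Node}) -> {handover_from, Node};
state_source({failover, Node}) -> {replica_of, Node}.

%% application callback: start an (empty) top supervisor, passing it
%% the chosen state source.
start(StartType, _StartArgs) ->
    Source = state_source(StartType),
    supervisor:start_link({local, dist_sup}, ?MODULE, Source).

stop(_State) ->
    ok.

%% supervisor callback: no children in this sketch; a real application
%% would start its workers here and give them the state source.
init(_Source) ->
    {ok, {{one_for_one, 1, 5}, []}}.
```

The point of the sketch is that only the takeover clause has a live 
peer to ask for the state; the failover and normal clauses need the 
state to have been copied somewhere else beforehand.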

Is there a common approach to replicating that state between two 
nodes?  Do we need a separate application, running on all nodes and 
responsible for replication, that our distributed application would 
consult on startup to initialize its state?  I suppose the same problem 
had to be solved in mnesia.
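
As a rough illustration of the "separate application on all nodes" 
idea, the sketch below is a small gen_server that would run on every 
node and keep the last state it was sent.  The module name 
state_replica and the functions save/1 and fetch/0 are hypothetical.  
It assumes that losing the most recent update is acceptable: 
gen_server:abcast/3 is asynchronous and unacknowledged, and nothing 
here reconciles copies that diverge during a partition.

```erlang
-module(state_replica).
-behaviour(gen_server).

-export([start_link/0, save/1, fetch/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Started on every node, registered locally under the module name.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Called by the active instance after each state change.
%% Sends the new state to the replica on this node and on every
%% currently connected node; there is no acknowledgement.
save(State) ->
    gen_server:abcast([node() | nodes()], ?MODULE, {save, State}),
    ok.

%% Called by the distributed application when it starts on this node.
%% Returns 'undefined' if nothing has been saved yet.
fetch() ->
    gen_server:call(?MODULE, fetch).

init([]) ->
    {ok, undefined}.

handle_call(fetch, _From, Copy) ->
    {reply, Copy, Copy}.

handle_cast({save, State}, _OldCopy) ->
    {noreply, State}.
```

With something like this in place, the distributed application could 
call state_replica:fetch() from its start/2 when it is not started by 
a takeover.  Whether this is preferable to keeping the state in a 
replicated mnesia table is exactly the open question.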

Regards,

Serge

-- 
Serge Aleynikov
R&D Telecom, IDT Corp.
Tel: (973) 438-3436
Fax: (973) 438-1464
serge@REDACTED