Dynamic Node Additions

Per Bergqvist <>
Wed Dec 11 10:00:33 CET 2002


Hi,                                                                   
                                                                      
... [snip] ...                                                        
                                                                      
>
> (We were looking at downing 1 node, loading the relevant boot
> scripts etc and then bringing it up again, then downing node 2 doing
> the same, the caveat however is that every application must be run on
> at least two nodes, and both those nodes must not go down
> simultaneously).
>
                                                                      
(If all nodes providing a service are down, there is not much to do,
is there?)
                                                                      
Is this a SASL distributed application?
I experienced severe problems with distributed applications at a
customer site earlier this spring.
My analysis was that the distributed application controller and its
underlying protocol are broken.
It is really easy to get the distributed application controller into  
deadlock states when two nodes start at the same time (e.g. reboot    
after a power failure on two identical hosts).                        
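
For readers who have not set one up: a distributed application is
normally declared through kernel configuration parameters. The fragment
below is only a sketch of what such a setup might look like on one of
two hosts. The application name myapp and the node names n1@host1 and
n2@host2 are made up for illustration, and the timeout values are
arbitrary.

```erlang
%% sys.config fragment for node n1@host1 (all names are hypothetical).
[{kernel,
  [%% myapp may run on n1 or n2; wait 5000 ms for a failed node to
   %% come back before the application is started elsewhere.
   {distributed, [{myapp, 5000, [n1@host1, n2@host2]}]},
   %% At boot, wait up to 30 s for the other node before continuing.
   {sync_nodes_optional, [n2@host2]},
   {sync_nodes_timeout, 30000}]}].
```

The simultaneous-start case described above is the one where both nodes
are inside this boot-time synchronisation at the same moment.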
                                                                      
Another bizarre side effect is that dist_ac always stops the running
instance of the application in a distributed cluster of nodes and
starts it on the last started node.
                                                                      
My personal view is: use non-distributed applications and roll your
own interlocking and failover mechanism.
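
As a rough idea of what "roll your own" could mean, here is a minimal
sketch, not a production design. It assumes one contender process per
node, uses a globally registered name as the interlock, and lets
standby processes monitor the current holder. The module name
simple_failover and the StartFun argument are hypothetical. The sketch
does not stop the service when the active process exits, and it leaves
netsplit handling to the default behaviour of global.

```erlang
-module(simple_failover).
-export([start/1]).

%% Hypothetical sketch, not an OTP facility: one process per node
%% competes for a globally registered name; the winner runs the service.
%% StartFun is an assumed zero-arity fun that starts the real service.
start(StartFun) ->
    spawn(fun() -> contend(StartFun) end).

contend(StartFun) ->
    case global:register_name(?MODULE, self()) of
        yes ->
            %% We hold the name, so this node is the active one.
            StartFun(),
            active();
        no ->
            standby(StartFun)
    end.

%% Active: stay alive to keep the name; it is released when we exit.
active() ->
    receive
        stop -> ok
    end.

%% Standby: watch the current holder and compete again when it dies.
standby(StartFun) ->
    case global:whereis_name(?MODULE) of
        undefined ->
            timer:sleep(500),
            contend(StartFun);
        Pid ->
            Ref = erlang:monitor(process, Pid),
            receive
                {'DOWN', Ref, process, _, _} -> contend(StartFun)
            end
    end.
```

Each node would call simple_failover:start(fun() -> ... end) at boot.
Only the process that wins the name runs StartFun; the others wait and
take over when the holder's process or node goes away.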
                                                                      
/Per                                                                  
                                                                      
=========================================================             
Per Bergqvist                                                         
Synapse Systems AB                                                    
Phone: +46 709 686 685                                                
Email:                                                    



More information about the erlang-questions mailing list