<div class="gmail_extra">Hi Olivier,</div><div class="gmail_extra"><br></div><div class="gmail_extra">In distributed computing, we see the collective effort of individual -- but not independent -- nodes. Communicating nodes need to be connected at all times and to be aware of each other's state, including availability status. The OAM node, or leader, has the complete picture of all the nodes.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">When it comes to stopping a distributed system, you have to look at the bigger picture of the whole system. The following steps are generally taken to tear down such a system.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">* Signal all the nodes of a forthcoming shutdown request</div><div class="gmail_extra">* Stop accepting new requests</div><div class="gmail_extra">* Finish servicing the accepted requests</div>
<div class="gmail_extra">* Take stock and clean up</div><div class="gmail_extra">* Check the readiness of all the nodes for shutdown</div><div class="gmail_extra">* Then call shutdown on all the nodes</div><div class="gmail_extra">
<br></div><div class="gmail_extra">Here, I am looking at the system from the top down: the shutdown process will generally be coordinated by a single process.</div><div class="gmail_extra"><br></div><div class="gmail_extra">An ad-hoc shutdown is scarier to me in a production environment.</div>
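<div class="gmail_extra"><br></div><div class="gmail_extra">The steps above could be sketched in Erlang roughly as follows. This is a minimal, hypothetical sketch: the registered 'oam_agent' process and the 'prepare_shutdown' message are illustrative assumptions (each node is assumed to run such an agent that stops accepting requests and drains the accepted ones), not existing APIs. For the synchronous wait, erlang:monitor_node/2 delivers a {nodedown, Node} message, which avoids polling with net_adm:ping/1:</div><div class="gmail_extra"><br></div>
<pre>
-module(teardown).
-export([shutdown_all/1]).

%% Tear down Nodes as in the steps above: signal, drain, then halt
%% all nodes in parallel and wait until every one is reported down.
shutdown_all(Nodes) ->
    %% Signal the forthcoming shutdown ('oam_agent' is a hypothetical
    %% registered process that stops accepting new requests and
    %% finishes servicing the accepted ones).
    [ {oam_agent, N} ! prepare_shutdown || N <- Nodes ],
    %% Monitor each node *before* halting it, so no nodedown is missed.
    [ erlang:monitor_node(N, true) || N <- Nodes ],
    %% Halt all nodes in parallel.
    [ rpc:cast(N, erlang, halt, [0]) || N <- Nodes ],
    wait_down(Nodes).

%% Block until a {nodedown, N} message has arrived for every node.
wait_down([]) -> ok;
wait_down(Nodes) ->
    receive
        {nodedown, N} ->
            wait_down(lists:delete(N, Nodes))
    after 10000 ->
        {error, {still_up, Nodes}}
    end.
</pre>
<div class="gmail_extra">Monitoring before casting the halt matters: it removes the race where a node dies before you start watching it.</div>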
<div class="gmail_extra"><br></div><div class="gmail_extra">Kind Regards,</div><div class="gmail_extra">Kannan.</div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div>
<div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 24, 2012 at 5:22 PM, Olivier BOUDEVILLE <span dir="ltr"><<a href="mailto:olivier.boudeville@edf.fr" target="_blank">olivier.boudeville@edf.fr</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br><font face="sans-serif">Hi,</font>
<br>
<br><font face="sans-serif">For a more controlled overall termination
of a distributed application, I am trying to shut down a series of nodes
synchronously, as cleanly and as much in parallel as possible, in a non-OTP program.
I imagine that using '[ rpc:cast( N, erlang, halt, [] ) || N <- MyTargetNodes
]' and then waiting for them to be terminated is the best approach for
that.</font>
<br>
<br><font face="sans-serif">As I now want these terminations to
be synchronous (i.e. I want my terminate function to return only when all
nodes are down for sure), I used to rely on checking their termination
using net_adm:ping/1 (waiting for pong to become pang), but kept on getting
(systematically) 'noconnection' errors (exceptions?), which do not seem
to be catchable (at least not with a 'try .. catch T:E -> .. end' clause).
This happens as soon as there is at least one node to halt which is
on the same host (of course it is not the local node from which that rpc:cast
is triggered).</font>
<br>
<br><font face="sans-serif">I switched to looping on 'lists:member(
Nodename, nodes() )' instead of ping (in both cases with proper waiting
between checks), but I still get 'noconnection' errors. It looks like 'noconnection'
is VM-level? As expected, commenting out the rpc:cast/3 never leads to
'noconnection'. </font>
<br>
<br><font face="sans-serif">I feel I would need something like net_kernel:unconnect_node/1.</font>
<br>
<br><font face="sans-serif">My question now: how can one deal gracefully
with such a synchronous node shutdown and tolerate the (intended) loss
of node(s)? </font>
<br>
<br><font face="sans-serif">Thanks in advance for any hint!</font>
<br><font face="sans-serif">Best regards,</font>
<br><font face="sans-serif"><br>
Olivier.<br>
---------------------------<br>
Olivier Boudeville<br>
<br>
EDF R&D : 1, avenue du Général de Gaulle, 92140 Clamart, France<br>
Département SINETICS, groupe ASICS (I2A), bureau B-226<br>
Office : <a href="tel:%2B33%201%2047%2065%2059%2058" value="+33147655958" target="_blank">+33 1 47 65 59 58</a> / Mobile : <a href="tel:%2B33%206%2016%2083%2037%2022" value="+33616833722" target="_blank">+33 6 16 83 37 22</a> / Fax : +33 1 47
65 27 13</font><p></p>
<br>_______________________________________________<br>
erlang-questions mailing list<br>
<a href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a><br>
<a href="http://erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>
<br></blockquote></div><br></div>