[erlang-questions] Automatically reconnecting nodes when they come back online
Joseph Wayne Norton
Fri Apr 26 19:13:11 CEST 2013
I don't have a direct answer to your question.
However, are you aware of the slave module?
Some of the recipe(s) in this module might be of use to you.
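As a starting point rather than an answer, the second idea in your mail (watch for nodedown, then ping only the nodes that are down) could be sketched roughly as below. The module name reconnector, the functions start/2 and update_down/3, the retry message and the stop message are all made up for illustration; only net_kernel:monitor_nodes/1, net_adm:ping/1 and erlang:send_after/3 are real OTP calls. Untested against a real cluster, so treat it as a sketch.

```erlang
-module(reconnector).
-export([start/2, update_down/3]).

%% Hypothetical entry point: Players is a list of player node names,
%% IntervalMs is how often to retry the players that are down.
start(Players, IntervalMs) ->
    spawn(fun() ->
        %% Ask for {nodeup, Node} / {nodedown, Node} messages in this process.
        ok = net_kernel:monitor_nodes(true),
        %% The first ping doubles as the conductor's "invitation".
        Down = [N || N <- Players, net_adm:ping(N) =:= pang],
        schedule(IntervalMs),
        loop(Players, Down, IntervalMs)
    end).

loop(Players, Down, IntervalMs) ->
    receive
        {nodedown, _} = Event ->
            loop(Players, update_down(Event, Players, Down), IntervalMs);
        {nodeup, _} = Event ->
            loop(Players, update_down(Event, Players, Down), IntervalMs);
        retry ->
            %% A successful ping (pong) sets up the connection again;
            %% nodes that still answer pang stay on the list.
            StillDown = [N || N <- Down, net_adm:ping(N) =:= pang],
            schedule(IntervalMs),
            loop(Players, StillDown, IntervalMs);
        stop ->
            ok
    end.

%% Pure bookkeeping: which players are currently considered down.
update_down({nodedown, Node}, Players, Down) ->
    case lists:member(Node, Players) andalso not lists:member(Node, Down) of
        true -> [Node | Down];
        false -> Down
    end;
update_down({nodeup, Node}, _Players, Down) ->
    lists:delete(Node, Down).

schedule(IntervalMs) ->
    erlang:send_after(IntervalMs, self(), retry).
```

Something like reconnector:start(['player1@host1', 'player2@host2'], 5000) on the conductor would be the intended use (the node names are placeholders). Your first idea is the same loop with every player pinged on each retry instead of only the down ones. One thing to watch: net_adm:ping/1 can block for a while when the target host is unreachable, so the retry interval is a lower bound, not an exact period.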
On 2013/04/27, at 2:00, Scott Thoman <scott@REDACTED> wrote:
> To all who know more about this than I do:
> First, I'm just beginning to learn about Erlang/OTP so I figured I'd
> use it to implement something useful.
> Part of what I'd like to build will involve a "conductor" controller
> node that directs some other "player" nodes to all do something at
> approximately the same time - ultimately to actually test the
> operation of another piece of distributed software. As part of those
> operations, I expect the player nodes may sometimes crash (actually
> cause a Windows BSOD in some cases) and then eventually come back to
> life.
> What I'm wondering about is what some folks have found to be good ways
> of getting nodes to rejoin the cluster when they come back to life.
> The way I'm thinking about it now is that the player nodes will be
> passive in the sense that they won't actively connect to any other
> nodes - they'll only get connected when the conductor node invites
> them in. I'm also not looking for fault tolerance on the conductor
> node at this point; if that one fails badly I'll just get some coffee
> and rerun the scenario again.
> My first two thoughts were:
> 1. When the conductor node connects up the player nodes it would also
> spawn a process whose sole job is to periodically ping the other nodes
> to ensure they're connected. Then when one goes down, those pings
> will just fail during that time but when the node comes back a ping
> will reconnect it to the other nodes. All this time, I'd be
> monitoring the node up/down messages.
> 2. I'd start by monitoring all the nodes as the conductor connects
> them and when receiving a node down message, spawn a process whose job
> it is to periodically ping only that node until it comes back.
> Are there some good practices out there for systems that want to
> behave like this?
> Thanks in advance,