[erlang-questions] Reconnect to nodes
Michael Truog
mjtruog@REDACTED
Fri Apr 22 20:25:08 CEST 2016
This is in CloudI (in cloudi_core) as shown at http://cloudi.org/api.html#2_nodes_set with reconnect_start and reconnect_delay determining the interval for checking the node connections. If you need help using cloudi_core, there is an example at https://github.com/CloudI/CloudI/tree/master/examples/hello_world5 .
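
For readers not using CloudI, the general idea of a periodic connection check can be sketched in plain Erlang. This is only a minimal sketch under stated assumptions, not CloudI's implementation: the module name reconnect_sketch, the Nodes list, and the DelayMs argument are hypothetical names chosen for illustration. It relies only on net_kernel:connect_node/1, nodes/1, and erlang:send_after/3.

```erlang
-module(reconnect_sketch).
-behaviour(gen_server).

-export([start_link/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Nodes:   node names we want to stay connected to (hypothetical input).
%% DelayMs: interval between connection checks, in milliseconds.
start_link(Nodes, DelayMs) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, {Nodes, DelayMs}, []).

init({Nodes, DelayMs}) ->
    %% Schedule the first check; each check schedules the next one.
    erlang:send_after(DelayMs, self(), check),
    {ok, {Nodes, DelayMs}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(check, {Nodes, DelayMs} = State) ->
    %% Wanted nodes that are neither this node nor currently connected.
    Missing = Nodes -- [node() | nodes(connected)],
    %% connect_node/1 returns true | false | ignored; failures are simply
    %% retried on the next check.
    lists:foreach(fun(N) -> net_kernel:connect_node(N) end, Missing),
    erlang:send_after(DelayMs, self(), check),
    {noreply, State};
handle_info(_Info, State) ->
    {noreply, State}.
```

It could be started with something like reconnect_sketch:start_link(['1@REDACTED'], 5000). Note that connect_node/1 blocks while a connection attempt is in progress, and that blindly reconnecting after a split may not be what you want if an application such as mnesia needs its own recovery first.
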
On 04/22/2016 04:42 AM, Roberto Ostinelli wrote:
> Thank you both for confirming.
> Hence I guess some kind of a basic reconnection manager might be helpful here.
>
> Thanks!
>
> r.
>
> On Fri, Apr 22, 2016 at 12:35 PM, Serge Aleynikov <serge@REDACTED> wrote:
>
> Roberto,
>
> This is the expected behavior. Note that the nodes will automatically reconnect by default when either node has a process that sends a message to a process on the remote node. This reconnection behavior can be modified by setting the kernel's 'dist_auto_connect' option.
>
> Other applications (such as mnesia) may require custom recovery from a network split, which is one of the reasons why automatic reconnection may not be desirable.
>
> Regards,
>
> Serge
>
> On Fri, Apr 22, 2016 at 5:39 AM, Roberto Ostinelli <roberto@REDACTED> wrote:
>
> Dear list,
> A simple question: am I correct that, when a node is removed because of a net split, you need to have your own application logic to reconnect to it, and nothing in the VM will try doing that for you?
>
> Let me show you an example. I have two nodes, 1@REDACTED and 2@REDACTED, that are connected to each other:
>
> (1@REDACTED)1> nodes().
> ['2@REDACTED']
>
> On node 2 I listen for nodedown events of node 1:
>
> (2@REDACTED)1> monitor_node('1@REDACTED', true).
> true
>
> On node 1, I simulate a net split with the best option I've found until now, i.e. suspending the net_kernel process:
>
> (1@REDACTED)2> sys:suspend(net_kernel).
> ok
>
> After ~60 seconds on node 2 I get:
>
> =ERROR REPORT==== 22-Apr-2016::11:28:21 ===
> ** Node '1@REDACTED' not responding **
> ** Removing (timedout) connection **
> (2@REDACTED)2> flush().
> Shell got {nodedown,'1@REDACTED'}
>
> Now the two nodes are disconnected:
>
> (1@REDACTED)3> nodes().
> []
>
> (2@REDACTED)3> nodes().
> []
>
> Even when I resume the net_kernel process:
>
> (1@REDACTED)4> sys:resume(net_kernel).
> ok
>
> The nodes do not reconnect:
>
> (1@REDACTED)5> nodes().
> []
>
> I'm ok with this, though I would like to confirm that my understanding is correct.
> If so, does everyone just implement some standard connection manager that does only reconnections?
>
> Thank you,
> r.
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions