distributed Erlang connect fails
Fredrik Thulin
ft@REDACTED
Wed Jun 1 11:08:15 CEST 2005
On Tuesday 31 May 2005 16.47, Gerd Flaig wrote:
> Fredrik Thulin <ft@REDACTED> writes:
> > I'm writing a command-line control tool for my application. I get
> > into problems if I execute my control-tool rapidly (like pressing
> > up-arrow and then enter in the UNIX shell).
>
> you could try to assign a unique name to each control tool instance,
> like in
>
> $ erl -name control$$ -hidden -remsh incomingproxy@`hostname -f`
Yes, sure. I think this is a bigger problem though. I have seen this
problem many times before, when I stop one of my nodes and want to
restart it immediately.
I've given some more thought to what the problem really is, and I think
it is that the node that keeps running is not made aware that the other
node has stopped.
Sometimes this is detected fairly quickly (so the restart works), but
sometimes it seems to take the full 75 seconds (comment in
dist_util.erl : "The detection time interval is thus, by default, 45s <
DT < 75s") of not receiving an answer to ticks sent to the other node
before the running node discovers that the other node is gone.
With a seven-second timeout for the node starting up, the connection
attempt obviously has a (great) chance of failing.
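To put rough numbers on the mismatch, here is a small shell sketch of the
detection window. It assumes the bounds are TickTime - TickTime/4 and
TickTime + TickTime/4, which agrees with the 45s/75s in the quoted comment
for a tick time of 60 seconds. The function name tick_bounds is made up
for this example and is not part of Erlang/OTP.

```shell
#!/bin/sh
# Print the lower and upper bound (in seconds) of the time it can take
# a running node to notice that a silent peer is gone.
# Assumption: bounds are TickTime -/+ TickTime/4 (integer arithmetic).
tick_bounds() {
    ticktime=$1
    quarter=$((ticktime / 4))
    echo "$((ticktime - quarter)) $((ticktime + quarter))"
}

tick_bounds 60   # default tick time: prints "45 75"
tick_bounds 8    # a much shorter tick time: prints "6 10"
```

Even with a tick time of 8 seconds, the upper bound (10 seconds) is still
longer than the seven-second timeout, so shortening the tick time alone
would only narrow the window. The tick time is the kernel parameter
net_ticktime (for example "erl -kernel net_ticktime 8"); as far as I know
it should be the same on all connected nodes, so treat this as something
to evaluate rather than a fix.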
/Fredrik