[erlang-questions] large scale deployments and netsplits
Tue Sep 15 10:41:12 CEST 2009
Bengt Tillman wrote:
> We have had to set the net ticktime to 300 in order to keep the Erlang
> nodes from losing contact with each other. The response times between
> different Erlang nodes is not mission critical in our application ...
I will admit that I have meditated over the network
tick algorithm in Erlang several times, without being any
wiser for it. It's a very nice piece of code, but I can't
help thinking that there is some fatal flaw buried deep in it.

At AXD 301, we tried reducing the detection times as much
as we could, but never could get below a net_ticktime of 10
without getting lots of false positives. In contrast, our
own device processor supervision had shorter detection times
(5-6 seconds, if memory serves) and practically never any false
positives, using the very same communication network.

The code wasn't nearly as elegant, though. :)

To be fair, this was on an internal ATM network, so we could
be fairly sure that the internal communication paths were
never starved by other traffic. This is of course not true
in general for TCP/IP networks.
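
For readers unfamiliar with the setting under discussion, here is a minimal sketch of how net_ticktime is configured and what detection window it implies. The arithmetic in the comments assumes the behaviour described in the kernel application documentation (a tick every quarter of the tick time, so a silent node is declared down after somewhere between 0.75 and 1.25 times the configured value); the value 10 is only the figure mentioned above, not a recommendation.

```erlang
%% sys.config (illustrative fragment): kernel application settings.
%% net_ticktime is given in seconds; the default is 60.
%%
%% Assumed detection window for a configured value T:
%%   ticks are sent every T/4 seconds, and a node that stays silent
%%   is considered down after between 0.75*T and 1.25*T seconds.
%%     T = 300  ->  about 225 to 375 seconds
%%     T = 10   ->  about 7.5 to 12.5 seconds (a tick every 2.5 seconds)
[
 {kernel, [
   {net_ticktime, 10}
 ]}
].
```

The same value can be passed on the command line as `erl -kernel net_ticktime 10`. All nodes in a cluster should normally use the same tick time.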

CTO, Erlang Training & Consulting Ltd