<div dir="ltr">If you check out <a name="set_net_ticktime-1"><span class="code">net_ticktime in the kernel_app docs, (you can set it with net_kernel:set_</span><span class="code">net_ticktime/1,2</span><span class="code">), you'll see:<br>
<br></span>"Once every <span class="code">TickTime/4</span> second, all
connected nodes are ticked (if anything else has been written
to a node) and if nothing has been received from another node
within the last four (4) tick times that node is considered
to be down..."<br><br>The default ticktime is 60s, meaning a ping every 15 seconds.<br></a><br><div class="gmail_quote">On Sat, Aug 16, 2008 at 5:24 PM, Serge Aleynikov <span dir="ltr"><<a href="mailto:saleyn@gmail.com">saleyn@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I suppose that the problem with the max number of sockets is solved by<br>
tweaking session limits (ulimit) and using kernel poll (+K true).<br>
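Both settings are easy to verify from a running node; a quick sanity
check, assuming the VM was started with "erl +K true" (max_fds
reflects the ulimit in effect when the VM booted):

  1> erlang:system_info(kernel_poll).
  true
  2> proplists:get_value(max_fds, erlang:system_info(check_io)).
  1024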
>
> As I understand it, in a 600-node cluster every node will maintain
> connections to the other 599 nodes and send periodic pings. So the
> pinging overhead would be something on the order of 10 events per
> second per node in this configuration. While that number doesn't
> seem intimidating, I wonder whether the overhead becomes noticeable
> in large network configurations, and whether there are any other
> guidelines that help architect such large clusters to keep the
> background load minimal.
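One nit on the arithmetic: per the doc text above, each otherwise idle
connection is ticked every TickTime/4 = 15 s, so 599 peers work out to
roughly 40 outgoing ticks per second per node (and about as many
incoming) in the worst case, rather than 10. A quick check:

  %% Worst case, assuming all 599 connections are otherwise idle
  %% and the default 60 s ticktime (one tick per peer every 60/4 s):
  1> 599 / (60 / 4).
  39.93333333333333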
<font color="#888888"><br>
Serge<br>
</font><div><div></div><div class="Wj3C7c"><br>
Viktor Sovietov wrote:<br>
> > Hi Serge
> >
> > As far as I know you're only limited by the maximum number of
> > sockets available on your system and by the number of atoms that
> > can be used as node names.
> > We tested a 600-node cluster, but I honestly can't recall whether
> > there were any patches to BEAM to increase the mentioned parameters.
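On recent OTP releases you can at least inspect those ceilings from a
running node (a sketch; older VMs didn't expose these through
system_info, and the port table was sized via the ERL_MAX_PORTS
environment variable rather than +Q):

  1> erlang:system_info(port_limit).
  65536
  2> erlang:system_info(atom_limit).
  1048576
  %% Both can be raised at boot, e.g. "erl +Q 131072 +t 2097152".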
> >
> > Sincerely,
> >
> > --Viktor
> >
> >
> > Serge Aleynikov-2 wrote:
> >> Does anyone have experience running somewhere between 200 and 400
> >> nodes in production? I recall that the Erlang distribution layer
> >> had a limit of 256 nodes. Is that still the case?
> >>
> >> I suppose that partitioning the cluster into several global_groups
> >> should limit the network load, and the number of open file
> >> descriptors on each node would be reduced.
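In case it helps, a sys.config sketch of that kind of partitioning
(the group and node names here are hypothetical, and every node must
be started with the same global_groups definition):

  %% Nodes in group_a sync global names only among themselves,
  %% and likewise for group_b.
  [{kernel,
    [{global_groups,
      [{group_a, ['n1@host_a', 'n2@host_b']},
       {group_b, ['n3@host_c', 'n4@host_d']}]}]}].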
> >>
> >> Are there any other concerns one should be aware of when working
> >> with such large clusters?
> >>
> >> Serge

_______________________________________________
erlang-questions mailing list
erlang-questions@erlang.org
http://www.erlang.org/mailman/listinfo/erlang-questions