<div dir="ltr">If you check out <a name="set_net_ticktime-1"><span class="code">net_ticktime</span> in the kernel_app docs (you can set it with <span class="code">net_kernel:set_net_ticktime/1,2</span>), you'll see:<br>

<br>"Once every <span class="code">TickTime/4</span> second, all

connected nodes are ticked (if anything else has been written

to a node) and if nothing has been received from another node

within the last four (4) tick times that node is considered

to be down..."<br><br>The default net_ticktime is 60 seconds, which means a tick on each connection every 15 seconds.<br></a><br><div class="gmail_quote">On Sat, Aug 16, 2008 at 5:24 PM, Serge Aleynikov <span dir="ltr">&lt;<a href="mailto:saleyn@gmail.com">saleyn@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I suppose that the problem with the max number of sockets is solved by<br>

tweaking session limits (ulimit) and using kernel poll (+K true).<br>

<br>

As I understand, in a 600 node cluster every node will maintain<br>

connections to the other 599 nodes and send periodic pings.  So that<br>

pinging overhead would be something on the order of 10 events per second<br>

per node in this configuration.  While the number doesn't seem<br>

intimidating I wonder if that overhead becomes noticeable in large<br>

network configurations and if there are any other guidelines that help<br>

architect such large network clusters to keep background load minimal.<br>

<font color="#888888"><br>

Serge<br>

</font><div><div></div><div class="Wj3C7c"><br>

Viktor Sovietov wrote:<br>

> Hi Serge<br>

><br>

> As far as I know you're only limited by the maximum number of sockets<br>

> which are available on your system and by the number of atoms which can be<br>

> used as node names.<br>

> We tested a 600-node cluster, but I honestly can't recall if there were any<br>

> patches to BEAM to increase the mentioned parameters.<br>

><br>

> Sincerely,<br>

><br>

> --Viktor<br>

><br>

><br>

> Serge Aleynikov-2 wrote:<br>

>> Does anyone have experience running somewhere between 200 and 400 nodes<br>

>> in production?  I recall that Erlang's distributed layer had a limit of<br>

>> 256 nodes.  Is it still the case?<br>

>><br>

>> I suppose that partitioning the cluster into several global_groups should<br>

>> limit the network load and the number of open file descriptors on each<br>

>> node would be reduced.<br>

>><br>

>> Are there any other concerns one should be aware of when working with<br>

>> such large clusters?<br>

>><br>

>> Serge<br>

>> _______________________________________________<br>

>> erlang-questions mailing list<br>

>> <a href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a><br>

>> <a href="http://www.erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://www.erlang.org/mailman/listinfo/erlang-questions</a><br>

>><br>

>><br>

><br>

<br>


</div></div></blockquote></div><br></div>
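<div dir="ltr">To put a rough number on the 600-node case discussed in this thread, here is a back-of-the-envelope sketch in Python (picked only for brevity). The function name and parameters are made up for illustration. It assumes the worst case: a full mesh where every connection is otherwise idle, so a tick goes out on each one every net_ticktime/4 seconds.<br></div>

```python
def ticks_per_second(nodes: int, net_ticktime_s: float = 60.0) -> float:
    """Rough upper bound on ticks one node sends per second.

    Assumes a fully connected cluster and that a tick is sent on
    every connection once per tick interval (net_ticktime / 4).
    """
    peers = nodes - 1                      # connections held by one node
    tick_interval_s = net_ticktime_s / 4   # 15 s with the default of 60 s
    return peers / tick_interval_s


# 600 nodes with the default net_ticktime of 60 s
print(round(ticks_per_second(600), 1))  # about 39.9
```

<div dir="ltr">Under these assumptions the default setting gives roughly 40 outgoing ticks per second per node. A net_ticktime of 240 seconds would bring that down to about 10, at the cost of slower detection of dead nodes.<br></div>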