It should be noted that there is an ongoing EU-funded research project which aims to address the limits of the current fully connected approach:

http://www.release-project.eu/
http://www.release-project.eu/documents/RELEASEfactsheetv6.pdf

The high-level goal is to figure out how to build clusters of up to 10^5 cores. We envision having to improve both SMP scalability (beyond 100 cores per node) and Distributed Erlang scalability (more than 1000 nodes), and to offer powerful and fault-tolerant standard libraries for high productivity. One factor that speaks for Erlang in this realm is that the MTTCF (mean time to core failure) for a cluster of that many cores, given current technology, could be less than an hour. This necessitates a solution that can handle partial failure.

It would be great if you could lean on those guys with your experiences and ideas. Catch any one of e.g. Robert Virding, Tino Breddin, Kostis, Simon Thompson, Francesco, Olivier Boudeville, or Kenneth Lundin and buy them a beer or two - I'm sure you'll find them receptive. ;-)

Using hidden nodes and spinning your own cluster is one thing we discussed, but our feeling was that this is way too low-level for a respectable Erlang approach. I imagine some sort of routing paradigm integrated into OTP will be needed if distribution transparency is to be preserved without heroic effort.

Regarding heart-beat timeouts, I think these are often due to port back-pressure. If the VM detects busy_dist_port, it will simply suspend the process trying to send, and when the dist port queue falls back under a given threshold, it resumes all the suspended processes at once. This can easily become disastrous in situations with very high traffic or slow ports (e.g. in a virtual environment).

Things have improved much in this regard in later releases. For one thing, buffers and thresholds have been increased and are more easily tunable (some hacking still required, I think), and overall throughput on the dist port has been improved. There is still much that can be done, but most of the really embarrassing problems should now be a thing of the past. If not, please share your experiences.
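If you want to check whether your heart-beat problems coincide with such suspensions, the VM can tell you. Below is a minimal sketch using the standard erlang:system_monitor/2 facility; the module and function names are just mine, made up for illustration.

    %% Sketch: log a warning whenever the VM suspends a process
    %% because a distribution port is busy (busy_dist_port).
    -module(dist_watch).
    -export([start/0, init/0]).

    start() ->
        spawn(?MODULE, init, []).

    init() ->
        %% Ask the runtime to send us a message for every
        %% busy_dist_port suspension it performs.
        erlang:system_monitor(self(), [busy_dist_port]),
        loop().

    loop() ->
        receive
            {monitor, Pid, busy_dist_port, Port} ->
                error_logger:warning_msg(
                  "~p suspended on busy dist port ~p~n", [Pid, Port]),
                loop();
            _Other ->
                loop()
        end.

If these warnings line up with your timeouts, the distribution buffer busy limit can be raised with the +zdbbl emulator flag (value in kilobytes) on releases that have it, e.g. erl +zdbbl 8192 - one of the knobs alluded to above.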
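Coming back to the hidden-node approach mentioned further up: to make it concrete, here is roughly what the do-it-yourself version looks like. Everything here is a made-up sketch (node names, module name, message format), not anything that ships with OTP - which is precisely the "too low-level" objection.

    %% Sketch: a gateway node started with "erl -name gw@host -hidden",
    %% which connects to a list of worker nodes and relays messages
    %% between them. Because the gateway runs hidden, the workers do
    %% not learn about each other through it, so no full mesh forms.
    -module(gw_relay).
    -export([start/1, init/1]).

    start(WorkerNodes) ->
        spawn(?MODULE, init, [WorkerNodes]).

    init(WorkerNodes) ->
        [net_kernel:connect_node(N) || N <- WorkerNodes],
        register(gw_relay, self()),
        loop().

    loop() ->
        receive
            %% Forward Msg to a registered name on another worker.
            {relay, DestName, DestNode, Msg} ->
                {DestName, DestNode} ! Msg,
                loop();
            _Other ->
                loop()
        end.

A worker then reaches a peer with something like {gw_relay, 'gw@host'} ! {relay, some_server, 'b@host', Msg} - and you have already started reinventing the routing, naming and failure handling that distribution normally gives you for free.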
BR,
Ulf W

On 15 Feb 2012, at 00:27, Richard O'Keefe wrote:

> On 15/02/2012, at 4:49 AM, Garrett Smith wrote:
>> On Sat, Feb 11, 2012 at 4:53 PM, Miles Fidelman
>> <mfidelman@meetinghouse.net> wrote:
>>> Does anybody have experience running an Erlang environment consisting of
>>> 100s, 1000s, or more nodes, spread across a network?
>>
>> As has been said, the fully connected mesh practically limits you to
>> 50 - 150 nodes, at least in my experience. You will also find that
>> connections flap quite a bit across unreliable networks, wreaking
>> havoc in your application if you don't design for it.
>
> What would happen if you had 1000 nodes in a box with a reliable but
> not ultrafast interconnect? I'm not talking about multicore here,
> although 16 Tileras in a smallish box doesn't seem unlikely any more,
> but say 1000 separate-physical-address-space nodes connected as a
> tree or a hypercube or something.
>
> Could distributed Erlang be set up in some hierarchical fashion?
>
> It seems to me that there are three issues:
>  - number of points of authentication
>    (network: many; cluster-in-a-box: one)
>  - number of eavesdropping points
>    (network: many; cluster-in-a-box: one)
>  - number of communicating devices
>    (network: many; cluster-in-a-box: many)
> and that just thinking in terms of authentication and eavesdropping,
> distributed Erlang makes perfect sense for cluster-in-a-box,
> IF it works at that scale.
>
> The Magnus project that Fergus O'Brien was involved with would have
> been using Erlang in this way, I believe.
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@erlang.org
> http://erlang.org/mailman/listinfo/erlang-questions