[erlang-questions] very large networks of erlang nodes?

Ulf Wiger ulf@REDACTED
Thu Feb 16 10:49:48 CET 2012


It should be noted that there is an ongoing EU-funded research project which aims to address the limits of the current fully-connected approach.

http://www.release-project.eu/
http://www.release-project.eu/documents/RELEASEfactsheetv6.pdf

The high-level goal is to figure out how to build clusters of up to 10^5 cores. We envision having to improve both SMP scalability (beyond 100 cores per node) and Distributed Erlang scalability (more than 1000 nodes), and offer powerful and fault-tolerant standard libraries for high productivity. One factor that speaks for Erlang in this realm is that MTTCF (mean time to core failure) for a cluster of that many cores, given current technology, could be less than an hour. This necessitates a solution that can handle partial failure.
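
To give a feel for the MTTCF figure, here is a back-of-the-envelope sketch in Python. It assumes cores fail independently with exponentially distributed lifetimes, and the 10-year per-core figure is purely an illustrative assumption, not a number from the project. Under those assumptions the mean time to the first core failure in the cluster is simply the per-core figure divided by the number of cores.

```python
HOURS_PER_YEAR = 24 * 365


def cluster_mttcf_hours(cores: int, per_core_mtbf_hours: float) -> float:
    """Mean time until some core fails, assuming independent exponential failures."""
    return per_core_mtbf_hours / cores


# Assumed for illustration only: each core fails once every 10 years on average.
per_core_mtbf = 10 * HOURS_PER_YEAR
mttcf = cluster_mttcf_hours(10**5, per_core_mtbf)
print(f"{mttcf:.3f} hours between core failures")  # prints "0.876 hours between core failures"
```

Even with a fairly optimistic per-core figure, the cluster as a whole sees a failure more than once an hour, which is why partial failure has to be treated as the normal case.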

It would be great if you could lean on those guys with experiences and ideas. Catch any one of e.g. Robert Virding, Tino Breddin, Kostis, Simon Thompson, Francesco, Olivier Boudeville, or Kenneth Lundin and buy them a beer or two - I'm sure you'll find them receptive. ;-)

Using hidden nodes and spinning your own cluster is one thing we discussed, among other options, but our feeling was that this is way too low-level for a respectable Erlang approach. I imagine some sort of routing paradigm integrated into OTP will be needed, if Distribution Transparency is to be preserved without heroic effort.
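
To illustrate why some form of routing helps, the following Python sketch compares connection counts for a fully connected mesh against a hypothetical two-level layout: nodes are split into equal groups, each group is fully meshed internally, and one gateway node per group joins a mesh of gateways. This layout and the group size of 50 are assumptions made for the arithmetic only; they are not a design from the project.

```python
def full_mesh_links(nodes: int) -> int:
    """Number of connections when every node connects to every other node."""
    return nodes * (nodes - 1) // 2


def grouped_links(nodes: int, group_size: int) -> int:
    """Full mesh inside each group, plus a full mesh among one gateway per group."""
    if nodes % group_size != 0:
        raise ValueError("nodes must be a multiple of group_size")
    groups = nodes // group_size
    return groups * full_mesh_links(group_size) + full_mesh_links(groups)


print(full_mesh_links(1000))    # prints 499500
print(grouped_links(1000, 50))  # prints 24690
```

In the grouped layout the busiest node (a gateway) holds 49 + 19 = 68 connections instead of 999, at the cost of messages between groups needing to be routed through gateways.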

Regarding heart-beat timeouts, I think these are often due to port back-pressure. If the VM detects busy_dist_port, it simply suspends the process trying to send, and when the dist_port queue falls back under a given threshold, it resumes all suspended processes at once. This can easily become disastrous under very high traffic or with slow ports (e.g. in a virtual environment).
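
The suspend-then-resume-all behaviour can be illustrated with a toy model in Python. This is not the VM's actual algorithm; the tick-based timing, the watermarks and the drain rate are all hypothetical values chosen to make the pattern visible. Each runnable sender tries to enqueue one message per tick, the port marks itself busy at the high watermark, and everyone is released together once the queue drops below the low watermark.

```python
def simulate(senders: int, high: int, low: int, drain_per_tick: int, ticks: int):
    """Return a list of (queue_length, suspended_senders) after each tick."""
    queue = 0
    suspended = 0
    busy = False
    history = []
    for _ in range(ticks):
        runnable = senders - suspended
        for _ in range(runnable):
            if busy:
                suspended += 1          # port is busy: suspend the sender
            else:
                queue += 1              # message accepted into the port queue
                if queue >= high:
                    busy = True         # high watermark reached
        queue = max(0, queue - drain_per_tick)
        if busy and queue < low:
            busy = False
            suspended = 0               # all suspended senders resume at once
        history.append((queue, suspended))
    return history


for queue, suspended in simulate(senders=100, high=50, low=10, drain_per_tick=20, ticks=6):
    print(queue, suspended)
```

With these numbers the model cycles through (30, 50), (10, 100), (0, 0): every sender ends up stalled, the queue drains completely, and then all 100 senders hit the port again in the same tick. A slower port (a smaller drain_per_tick) stretches the stalled phase, which is the kind of window in which a heart-beat could plausibly be delayed.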

Things have improved much in this regard in later releases. For one thing, buffers and thresholds have been increased and are more easily tunable (some hacking required still, I think), and overall throughput on the dist port has been improved. There is still much that can be done, but most of the really embarrassing problems should now be a thing of the past. If not, please share your experiences.

BR,
Ulf W

On 15 Feb 2012, at 00:27, Richard O'Keefe wrote:

> 
> On 15/02/2012, at 4:49 AM, Garrett Smith wrote:
>> On Sat, Feb 11, 2012 at 4:53 PM, Miles Fidelman
>> <mfidelman@REDACTED> wrote:
>>> Does anybody have experience running an Erlang environment consisting of
>>> 100s, 1000s, or more nodes, spread across a network?
>> As has been said, the fully connected mesh practically limits you to
>> 50 - 150 nodes, at least in my experience. You will also find that
>> connections flap quite a bit across unreliable networks, wreaking
>> havoc in your application if you don't design for it.
> 
> What would happen if you had 1000 nodes in a box with a reliable but
> not ultrafast interconnect?  I'm not talking about multicore here,
> although 16 Tileras in a smallish box doesn't seem unlikely any more,
> but say 1000 separate-physical-address-space nodes connected as a
> tree or a hypercube or something.
> 
> Could distributed Erlang be set up in some hierarchical fashion?
> 
> It seems to me that there are three issues:
> - number of points of authentication
> 	(network: many; cluster-in-a-box: one)
> - number of eavesdropping points
> 	(network: many; cluster-in-a-box: one)
> - number of communicating devices
> 	(network: many; cluster-in-a-box: many)
> and that just thinking in terms of authentication and eavesdropping,
> distributed Erlang makes perfect sense for cluster-in-a-box,
> IF it works at that scale.
> 
> The Magnus project that Fergus O'Brien was involved with would have
> been using Erlang in this way, I believe.
