<div dir="ltr">Thanks for the heads-up, Lukas! Sorry I stopped responding. We ended up solving the problem (for now) by drastically cutting down on inter-node communication in another way, and this thread got lost in my inbox. I really appreciate the follow-up!</div>
<div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, May 16, 2013 at 3:43 AM, Lukas Larsson <span dir="ltr"><<a href="mailto:lukas@erlang.org" target="_blank">lukas@erlang.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><div>Hello Brian,<br><br></div>Just letting you know that I have merged a fix which changes the rwlock I mentioned before to an rlock. This should reduce the contention you are seeing if it was caused by many calls to erlang:nodes().<span><font color="#888888"><br>
<br>Lukas<br></font></span></div><div class="gmail_extra"><br><br><div class="gmail_quote"><div><div>On Tue, Apr 23, 2013 at 9:01 PM, Scott Lystig Fritchie <span dir="ltr"><<a href="mailto:fritchie@snookles.com" target="_blank">fritchie@snookles.com</a>></span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>Brian Picciano <<a href="mailto:mediocregopher@gmail.com" target="_blank">mediocregopher@gmail.com</a>> wrote:<br>
<br>
bp> We have a pool of 3 erlang nodes, all on different servers. Every<br>
bp> afternoon, without fail, we start seeing lots of messages between<br>
bp> the nodes having really high latency, on the order of tens of<br>
bp> seconds. [...]<br>
<br>
Brian, it's probably worthwhile to continue chasing the 'lcnt' avenue<br>
as you've been corresponding with Lukas...<br>
<br>
... but at the same time, I also wonder about "tens of seconds". My gut<br>
says that such delays would require some amazingly high lock contention<br>
rates. Something that can cause such messaging delays much more easily<br>
is network congestion/packet loss that triggers a TCP retransmission<br>
timeout followed by slow start. Linux kernels default to an RTO_min of<br>
200 ms (RFC 6298 recommends one second), and the timeout doubles on each<br>
successive loss, so repeated drops can add up to seconds of delay.<br>
<br>
If network packet loss is a problem, this blog posting can explain one<br>
reason why it's happening:<br>
<a href="http://www.snookles.com/slf-blog/2012/01/05/tcp-incast-what-is-it/" target="_blank">http://www.snookles.com/slf-blog/2012/01/05/tcp-incast-what-is-it/</a><br>
<span><font color="#888888"><br>
-Scott<br>
</font></span></div></div><div><div><div>_______________________________________________<br>
erlang-questions mailing list<br>
<a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>
<a href="http://erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>
</div></div></div></blockquote></div><br></div>
</blockquote></div><br></div>