<div dir="ltr">Thanks for the heads up, Lukas! Sorry I stopped responding. We ended up solving the problem (for now) by drastically cutting down on inter-node communication in another way, and this thread got lost in my inbox. I really appreciate the follow-up!</div>


<div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, May 16, 2013 at 3:43 AM, Lukas Larsson <span dir="ltr">&lt;<a href="mailto:lukas@erlang.org" target="_blank">lukas@erlang.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div dir="ltr"><div>Hello Brian,<br><br></div>Just letting you know that I have merged a fix which changes the rwlock I mentioned before to an rlock. This should reduce the contention you are seeing, if it was caused by many calls to erlang:nodes().<span><font color="#888888"><br>


<br>Lukas<br></font></span></div><div class="gmail_extra"><br><br><div class="gmail_quote"><div><div>On Tue, Apr 23, 2013 at 9:01 PM, Scott Lystig Fritchie <span dir="ltr">&lt;<a href="mailto:fritchie@snookles.com" target="_blank">fritchie@snookles.com</a>&gt;</span> wrote:<br>


</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>Brian Picciano &lt;<a href="mailto:mediocregopher@gmail.com" target="_blank">mediocregopher@gmail.com</a>&gt; wrote:<br>


<br>

bp> We have a pool of 3 erlang nodes, all on different servers. Every<br>

bp> afternoon, without fail, we start seeing lots of messages between<br>

bp> the nodes start having really high latency, on the order of tens of<br>

bp> seconds. [...]<br>

<br>

Brian, it's probably worthwhile to continue chasing the 'lcnt' avenue<br>

as you've been corresponding with Lukas...<br>

<br>

... but at the same time, I also wonder about "tens of seconds".  My gut<br>

says that such delays would require some amazingly high lock contention<br>

rates.  Something that can cause such messaging delays much more easily<br>

is network congestion/packet loss that triggers TCP slow start.  Many<br>

Linux kernels have the RTO_min value at one second, which is the amount<br>

of time to wait before entering slow start state.<br>

<br>

If network packet loss is a problem, this blog posting can explain one<br>

reason why it's happening:<br>

<a href="http://www.snookles.com/slf-blog/2012/01/05/tcp-incast-what-is-it/" target="_blank">http://www.snookles.com/slf-blog/2012/01/05/tcp-incast-what-is-it/</a><br>

<span><font color="#888888"><br>

-Scott<br>

</font></span></div></div><div><div><div>_______________________________________________<br>

erlang-questions mailing list<br>

<a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>

<a href="http://erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>

</div></div></div></blockquote></div><br></div>

</blockquote></div><br></div>