<!DOCTYPE html><html><head><title></title><style type="text/css">p.MsoNormal,p.MsoNoSpacing{margin:0}</style></head><body><div>Yeah, instrumentation from the beginning is a good bet. Shameless plug: <a href="https://opencensus.io/quickstart/erlang/">https://opencensus.io/quickstart/erlang/</a> :) -- and prometheus.erl for VM metrics, like Jesper suggests.<br></div><div><br></div><div>Tristan<br></div><div><br></div><div>On Fri, Apr 12, 2019, at 03:53, Jesper Louis Andersen wrote:<br></div><blockquote type="cite" id="qt"><div dir="ltr"><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">My first recommendation is to add instrumentation to the system, so you can see what is going on:<br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default"><br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">* Tristan already suggested looking at mailbox sizes.<br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">* Network blocking is worth investigating as well. Many small messages can lead to network overload situations.<br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">* Docker/Kubernetes environments tend to be noisy if a lot of work is running in them. In particular, if you have high-throughput systems banded with low-latency systems, you are going to run into trouble.<br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">* Enable the Erlang system monitor. 
Get it to report on blocked ports and processes.<br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">* Add VM metrics: Prometheus, for instance.<br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default"><br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">The problem can be anywhere: inside your code, the VM, Docker, the kernel, the hardware, ... Your first goal is to narrow that down. Verify that things look correct in each layer before moving to the next.<br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default"><br></div><div style="font-family:arial, helvetica, sans-serif;" class="qt-gmail_default">The fact that latency starts out at 1 second, where it is at the millisecond level locally, suggests the problem has to do with the distribution, either in your own code or in the underlying setup.<br></div></div><div><br></div><div class="qt-gmail_quote"><div class="qt-gmail_attr" dir="ltr">On Thu, Apr 11, 2019 at 9:07 PM Konstantinos Kallas &lt;<a href="mailto:konstantinos.kallas@hotmail.com">konstantinos.kallas@hotmail.com</a>&gt; wrote:<br></div><blockquote style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0.8ex;border-left-color:rgb(204, 204, 204);border-left-style:solid;border-left-width:1px;padding-left:1ex;" class="qt-gmail_quote"><div bgcolor="#FFFFFF"><p><span style="font-family:Ubuntu" class="font">Hello,</span><br></p><p><span style="font-family:Ubuntu" class="font">I have an Erlang application where latency is crucial and a lot of small messages (tuples with an atom and an integer) are exchanged between processes on different nodes. </span><br></p><p><span style="font-family:Ubuntu" class="font">The main procedure is that a main process sends a small message to 4 worker processes on other Erlang nodes, the worker processes do some negligible processing, and then they reply to the main node with a small message. 
</span><br></p><p><span style="font-family:Ubuntu" class="font">Each Erlang node runs in its own Docker container (generated from the erlang:21 Docker image), and all the containers are connected using a standard Docker bridge network.</span><br></p><p><span style="font-family:Ubuntu" class="font">I have noticed that latency (the time from when the first message is sent until its replies arrive) increases linearly with time. It starts at 1 second, and after 30 seconds of execution it has grown to 10 seconds.</span><br></p><p><span style="font-family:Ubuntu" class="font">I have tried running all processes on the same Erlang node, and then latency is (as expected) a couple of milliseconds, so my assumption is that the problem could be caused by one (or more) of the following:</span><br></p><p><span style="font-family:Ubuntu" class="font">- Some misconfiguration of the Erlang nodes</span><br></p><p><span style="font-family:Ubuntu" class="font">- Some misconfiguration of the Docker network/containers</span><br></p><p><span style="font-family:Ubuntu" class="font">- Some penalty imposed by the operating system/Docker because a lot of small messages are exchanged</span><br></p><p><span style="font-family:Ubuntu" class="font">Has anyone encountered this issue, or does anyone know how to configure Erlang nodes (and the operating system) to reduce message latency? 
</span><br></p><p><span style="font-family:Ubuntu" class="font">Thanks in advance.</span><br></p><p><span style="font-family:Ubuntu" class="font">Best,</span><br></p><p><span style="font-family:Ubuntu" class="font">Konstantinos</span><br></p></div><div>_______________________________________________<br></div><div> erlang-questions mailing list<br></div><div> <a href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a><br></div><div> <a rel="noreferrer" href="http://erlang.org/mailman/listinfo/erlang-questions">http://erlang.org/mailman/listinfo/erlang-questions</a><br></div></blockquote></div><div><br></div><div><br></div><div>-- <br></div><div class="qt-gmail_signature" dir="ltr">J.<br></div><div><br></div></blockquote><div><br></div></body></html>