<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><span style="font-family:Arial,Helvetica,sans-serif">On Tue, Mar 19, 2019 at 1:35 PM Borja de Regil &lt;<a href="mailto:borja.deregil@imdea.org">borja.deregil@imdea.org</a>&gt; wrote:</span><br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

Apart from the increase in latency between sites, no other configuration is changed. My initial expectation was that the throughput would stay the same, even if the base latency would increase, and that the saturation point would be reached at approximately the same number of requests per second.<br>

<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">In your scenarios, this assumption needs amendment. Your client code sends a single message and then waits for a reply, so at most one message is on the line at any given time. In effect, you have implemented a stop-and-go protocol on top of TCP. In such a protocol, an increase in network delay/latency incurs a loss of bandwidth. Your A and B experiments use RTTs of 0.25ms and 10ms respectively. This is a 40-fold increase in latency, and it will affect the bandwidth between the peers: it puts an artificial limit on how much data you can transfer. You keep your messages at 1 kilobyte, so the req/s figure is essentially a bandwidth measurement in bytes/sec. The problem is the same as in, e.g., satellite communications: geosynchronous orbit is around 550ms away in practice. It is also linked to the so-called bandwidth*delay product (BDP).</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Some napkin math: Suppose you have 1 connection with an RTT of at least 10ms. With one request in flight at a time, that connection can complete at most 1/0.010 = 100 req/s. Suppose we have 500 of those connections. Then the maximal req/s is 500*100 = 50,000. </div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">A way I often approach these questions is by creating an extreme scenario: I have 1 connection and 3 seconds of latency. What happens? 
The goal is to identify the shadow constraints of the system so you can understand where the bottleneck moves once the apparent first bottleneck is found and eliminated.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The other problem is that your load generator coordinates with the system under test, which leads to coordinated omission[0]. The load generator only issues the next request once the previous one completes, so when the system stalls, the requests that should have been sent in the meantime are never issued and never measured. It is usually better to keep the bandwidth usage constant and then measure latency, counting a late request against the system.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The astute reader will also observe you measure the mean latency. This is not very useful on its own, and you should look at a boxplot, kernel density plot, histogram, or the like. If you know the data is normally distributed with the same standard deviation, then your average latency makes sense as a comparable value. But this requires that you plot the data, look at it, and make sure it has that shape. Otherwise you can be led astray. As an example, suppose I have 2 fair dice. One die has faces 1,2,3,4,5,6. The other has faces 1,1,1,6,6,6. These two dice have the same mean (3.5), but you would not argue they are the same in distribution. In fact, the latter die never produces an observation close to 3.5!</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Now to solutions:</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The problem has to do with physics, more so than Erlang. Information travels as a wave in a medium such as copper wire or fiber. 
The speed of that wave has an upper limit: the speed of light, as information cannot travel faster than that. In practice, light in fiber travels at roughly 2/3 of its vacuum speed, and you can assume that is relatively constant. You need to employ latency hiding tricks to work around this limit.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Batching is an option. Collect multiple requests and send them all off at the same time. This effectively makes sure you can have multiple requests in flight, which gets you around the delay constant. It also allows a smart server to process multiple requests simultaneously, thus shedding load. Microbatching is alluring here: when the first request arrives, set a cork for 5ms. Once you have read 500 reqs or the timer triggers, whichever comes first, process whatever you have in the batch. Then start all over. This is used by e.g., Kafka and TensorFlow, the latter to avoid the memory bottleneck between DRAM and GPU.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Pipelining is another option. Send multiple requests back to back, so they are all in the network. Interlace your receive loop with new requests being sent, so you can keep the work going. You should consider tagging messages with unique identifiers so they can be told apart. This allows out-of-order processing. See Plan 9's 9P protocol, RabbitMQ/AMQP's correlation IDs, or HTTP/2. Quick implementation: Loïc Hoguin's Cowboy/Gun combo works wonders here, and both use the same code base (cowlib). This will avoid the wait time effectively. </div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">[0] See e.g., Gil Tene's work on this.</div><br></div></div></div>
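<div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The napkin math above can be written down as a tiny Python sketch. The function name and the in_flight parameter are my own, purely for illustration. It assumes every request costs exactly one round trip and ignores server time, so the numbers are network-imposed ceilings, not predictions; a real system will saturate on CPU or something else long before the 0.25ms ceiling. I also treat the "3 seconds of latency" in the extreme scenario as the RTT.</div>

```python
def max_requests_per_second(connections, rtt_ms, in_flight=1):
    """Network-imposed ceiling on req/s.

    Assumes each connection keeps `in_flight` requests outstanding and
    each request costs exactly one round trip of `rtt_ms` milliseconds.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return connections * in_flight * 1000 / rtt_ms


# Figures from the discussion: 500 connections, one request in flight each.
ceiling_a = max_requests_per_second(500, 0.25)  # experiment A, RTT 0.25 ms
ceiling_b = max_requests_per_second(500, 10)    # experiment B, RTT 10 ms

# The extreme scenario: 1 connection, 3 s round trip (assumed to be the RTT).
extreme = max_requests_per_second(1, 3000)

# Keeping 40 requests in flight per connection (pipelining or batching)
# brings the 10 ms ceiling back up to the 0.25 ms one.
pipelined_b = max_requests_per_second(500, 10, in_flight=40)

print(ceiling_a, ceiling_b, extreme, pipelined_b)
```

<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The extreme case comes out at one request every three seconds on that connection, no matter how fast the server is, which is the shadow constraint the thought experiment is meant to expose.</div>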
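<div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The microbatching rule (cork on first arrival, flush on 500 requests or timer expiry) can be sketched as an offline simulation in Python. This is not how Kafka or TensorFlow implement it; the function name, the (timestamp, request) input format and the parameter names are assumptions for illustration. A live system would use a real timer, whereas here a batch is closed when the next arrival is seen at or after the deadline. Arrivals are assumed to be sorted by timestamp.</div>

```python
def microbatch(arrivals, cork_ms=5, max_batch=500):
    """Group (timestamp_ms, request) pairs into batches.

    A batch opens when its first request arrives. It closes when it holds
    `max_batch` requests, or when a later request arrives at or after the
    cork deadline (first arrival + cork_ms). Arrivals must be time-ordered.
    """
    batches = []
    current = []
    deadline = None
    for ts, req in arrivals:
        # Cork expired before this request arrived: flush the open batch.
        if current and ts >= deadline:
            batches.append(current)
            current = []
        # First request of a new batch sets the cork.
        if not current:
            deadline = ts + cork_ms
        current.append(req)
        # Size limit reached: flush without waiting for the timer.
        if len(current) == max_batch:
            batches.append(current)
            current = []
    if current:
        batches.append(current)  # flush whatever is left at the end
    return batches


arrivals = [(0, "a"), (1, "b"), (4, "c"), (5, "d"), (6, "e"), (20, "f")]
by_timer = microbatch(arrivals, cork_ms=5, max_batch=500)
by_size = microbatch(arrivals, cork_ms=5, max_batch=2)
print(by_timer)
print(by_size)
```

<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The cork bounds the extra latency any single request pays for being batched, and the size limit bounds the work per batch; which of the two triggers first depends on the arrival rate.</div>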