<div dir="ltr">This difference (batching improves performance) interests me because it reminds me of a problem we had: many small writes to gen_tcp bogged down performance, whereas batching them into something closer to the MTU or the TCP buffer size greatly improved throughput. Calling into the driver added ~30 ms when sending tiny messages, even with Nagle off.<div><br></div><div>I'm curious: did you try gen_udp with the {active, N} option, or were the sockets already running in {active, true} mode?</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Feb 3, 2016 at 10:23 AM, Max Lapshin <span dir="ltr"><<a href="mailto:max.lapshin@gmail.com" target="_blank">max.lapshin@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">When we use plain gen_udp to accept 200-400 Mbit/s of incoming MPEG-TS traffic in Flussonic, we use about 50% of a moderate 4-core Xeon E3 server.</div><div class="gmail_extra"><br></div><div class="gmail_extra">When we switch to our driver implementation of UDP, which collapses several contiguous UDP messages into a single big message (this is allowed for MPEG-TS), usage drops to 15-20%.</div><div class="gmail_extra"><br></div><div class="gmail_extra">I can't say that it is "badly written UDP in Erlang"; it is just that messaging is rather expensive.</div></div>

</blockquote></div><br></div>
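<div dir="ltr">For reference, here is a minimal sketch of what I mean by {active, N} on a gen_udp socket. The module name, the batch size of 100, and the handle/1 callback are placeholders of mine, not anything taken from Flussonic:<div><br></div><pre>
-module(udp_active_n).
-export([start/1]).

%% Hypothetical batch size: datagrams delivered before the socket goes passive.
-define(BATCH, 100).

start(Port) ->
    {ok, Sock} = gen_udp:open(Port, [binary, {active, ?BATCH}]),
    loop(Sock).

loop(Sock) ->
    receive
        {udp, Sock, _Addr, _Port, Data} ->
            handle(Data),
            loop(Sock);
        {udp_passive, Sock} ->
            %% The counter reached zero: re-arm the socket for another batch.
            ok = inet:setopts(Sock, [{active, ?BATCH}]),
            loop(Sock)
    end.

%% Placeholder for the real packet processing.
handle(_Data) ->
    ok.
</pre><div>With {active, N} the socket delivers up to N datagrams as messages and then sends {udp_passive, Socket}, so the receiver re-arms once per batch rather than once per packet as with {active, once}. Each datagram still arrives as its own message, so this would not remove the per-message cost you describe.</div></div>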