<div dir="ltr">Hi!<br><br>We have a performance problem when receiving a lot of UDP traffic.<div>There are about 70 UDP receiver processes, each handling roughly 1 to 10 Mbit/s of multicast traffic with {active, N}.<br><br>Here is the msacc summary from my OS X laptop, running a build of OTP master c30309e799212b080c39ee2f91af3f9a0383d767 (Apr 19):<div><br></div><div><pre style="font-family:Consolas,Menlo,"Liberation Mono",Courier,monospace;margin:1em 1em 1em 1.6em;padding:8px;background-color:rgb(250,250,250);border:1px solid rgb(226,226,226);border-radius:3px;width:auto;overflow-x:auto;overflow-y:hidden;color:rgb(51,51,51);font-size:12px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial">        Thread    alloc      aux     bif busy_wait check_io emulator      ets       gc  gc_full      nif    other     port     send    sleep   timers
     scheduler   30.02%    0.92%    2.86%   24.66%    0.01%    9.61%    0.03%    1.25%    0.20%    0.13%    2.34%    9.33%    0.41%   17.78%    0.44%
</pre><br></div><div><div>Our Linux production server behaves the same way (we do not have extended msacc there yet, so most of the alloc time is reported under port).</div><div><br></div><div>perf top on the production Linux server shows that most of the time is spent in an unaligned memmove:</div><div><pre style="font-family:Consolas,Menlo,"Liberation Mono",Courier,monospace;margin:1em 1em 1em 1.6em;padding:8px;background-color:rgb(250,250,250);border:1px solid rgb(226,226,226);border-radius:3px;width:auto;overflow-x:auto;overflow-y:hidden;color:rgb(51,51,51);font-size:12px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial">  69.76%  libc-2.24.so        [.] __memmove_sse2_unaligned_erms
   6.13%  beam.smp            [.] process_main
   2.02%  beam.smp            [.] erts_schedule
   0.87%  [kernel]            [k] copy_user_enhanced_fast_string
</pre>I'll try to put together a minimal example that reproduces this.</div><div>In the meantime, are there any simple recommendations for optimizing this kind of load?</div><div><br></div><div>-- <br>Danil Zagoskin | <a href="mailto:z@gosk.in">z@gosk.in</a></div></div></div></div>