[erlang-questions] UDP receive performance
Jonas Falkevik
jonas.falkevik@REDACTED
Wed Jun 20 13:56:27 CEST 2018
On Mon, Jun 18, 2018 at 3:57 PM, Lukas Larsson <lukas@REDACTED> wrote:
>
>
>>> Something like https://github.com/falkevik/ot
>> p/commit/444fb00ff2a9d1f40a8c66f48bea1cf3f07ca86c ?
>>
>
> I was thinking something more like this: https://github.com/garazdawi/o
> tp/tree/lukas/erts/realloc_thr_pref_shrink_th
>
> This makes it so that the absolute and relative single block shrink
> thesholds are respected for reallocs made on a remote scheduler. This
> should solve the problem that you have found also, through as I still
> haven't reproduced it I can't test that it actually solves it.
>
>
Are you adding the multicast network to the loop back interface? Using some
other interface that does not allow multicast traffic?
I have been able to reproduce it in macOS Sierra (Darwin Kernel Version
16.7.0: Fri Apr 27 17:59:46 PDT 2018; root:xnu-3789.73.13~1/RELEASE_X86_64
x86_64)
and also on Ubuntu Linux 16.04 LTS with kernel 4.4.0-112-generic.
Danil Zagoskin, do you have time to try and see if there is any change in
the behaviour for you with a patched system?
With the bugfix, the beam is spending most of its time in sched_spin_wait
on linux and sched_yield on macOS.
Stats below are from OTP 21 and then OTP21 with bugfix on linux
Erlang/OTP 21 [erts-10.0] [source] [64-bit] [smp:8:8] [ds:8:8:10]
[async-threads:1] [hipe]
Eshell V10.0 (abort with ^G)
1> udptest:start_sender({239,9,9,9}, 3999).
<0.78.0>
2> [udptest:start_reader({239,9,9,9}, 3999) || _ <- lists:seq(1, 40)].
[<0.80.0>,<0.81.0>,<0.82.0>,<0.83.0>,<0.84.0>,<0.85.0>,
<0.86.0>,<0.87.0>,<0.88.0>,<0.89.0>,<0.90.0>,<0.91.0>,
<0.92.0>,<0.93.0>,<0.94.0>,<0.95.0>,<0.96.0>,<0.97.0>,
<0.98.0>,<0.99.0>,<0.100.0>,<0.101.0>,<0.102.0>,<0.103.0>,
<0.104.0>,<0.105.0>,<0.106.0>,<0.107.0>,<0.108.0>|...]
3> msacc:start(10000), msacc:print().
Average thread real-time : 10002138 us
Accumulated system run-time : 62576106 us
Average scheduler run-time : 7775309 us
Thread aux check_io emulator gc other port
sleep
Stats per thread:
async( 0) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
aux( 1) 0.06% 0.96% 0.00% 0.00% 0.07% 0.00%
98.90%
dirty_cpu_( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 5) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 6) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 7) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 8) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 5) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 6) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 7) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 8) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 9) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s(10) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
poll( 0) 0.00% 2.63% 0.00% 0.00% 0.00% 0.00%
97.37%
scheduler( 1) 0.82% 0.00% 0.52% 0.35% 7.33% 67.91%
23.08%
scheduler( 2) 0.76% 0.00% 0.79% 0.34% 7.30% 68.61%
22.21%
scheduler( 3) 0.74% 0.00% 0.83% 0.33% 7.17% 68.54%
22.38%
scheduler( 4) 0.76% 0.00% 0.72% 0.34% 7.38% 68.84%
21.96%
scheduler( 5) 0.79% 0.00% 0.82% 0.34% 7.21% 68.94%
21.90%
scheduler( 6) 0.80% 0.00% 0.74% 0.31% 7.15% 68.56%
22.44%
scheduler( 7) 0.87% 0.00% 0.72% 0.37% 7.24% 68.43%
22.36%
scheduler( 8) 0.77% 0.00% 0.68% 0.35% 7.41% 69.02%
21.77%
Stats per type:
async 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
aux 0.06% 0.96% 0.00% 0.00% 0.07% 0.00%
98.90%
dirty_cpu_sche 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_sched 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
poll 0.00% 2.63% 0.00% 0.00% 0.00% 0.00%
97.37%
scheduler 0.79% 0.00% 0.73% 0.34% 7.27% 68.61%
22.26%
ok
4>
$ ../src/udp_performace_bug_fix/bin/erl
Erlang/OTP 21 [RELEASE CANDIDATE 2] [erts-9.3.2] [source] [64-bit]
[smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
Eshell V9.3.2 (abort with ^G)
1> udptest:start_sender({239,9,9,9}, 3999).
<0.78.0>
2> [udptest:start_reader({239,9,9,9}, 3999) || _ <- lists:seq(1, 40)].
[<0.80.0>,<0.81.0>,<0.82.0>,<0.83.0>,<0.84.0>,<0.85.0>,
<0.86.0>,<0.87.0>,<0.88.0>,<0.89.0>,<0.90.0>,<0.91.0>,
<0.92.0>,<0.93.0>,<0.94.0>,<0.95.0>,<0.96.0>,<0.97.0>,
<0.98.0>,<0.99.0>,<0.100.0>,<0.101.0>,<0.102.0>,<0.103.0>,
<0.104.0>,<0.105.0>,<0.106.0>,<0.107.0>,<0.108.0>|...]
3> msacc:start(10000), msacc:print().
Average thread real-time : 10000416 us
Accumulated system run-time : 19541445 us
Average scheduler run-time : 2279822 us
Thread aux check_io emulator gc other port
sleep
Stats per thread:
async( 0) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
aux( 1) 0.00% 10.72% 0.00% 0.00% 0.00% 0.00%
89.28%
dirty_cpu_( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 5) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 6) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 7) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_cpu_( 8) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 5) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 6) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 7) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 8) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s( 9) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_s(10) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
poll( 0) 0.00% 2.31% 0.00% 0.00% 0.00% 0.00%
97.69%
scheduler( 1) 0.46% 0.00% 0.43% 0.14% 17.57% 4.07%
77.33%
scheduler( 2) 0.50% 0.00% 0.44% 0.15% 17.13% 4.21%
77.57%
scheduler( 3) 0.49% 0.00% 0.47% 0.16% 17.07% 4.35%
77.46%
scheduler( 4) 0.45% 0.00% 0.42% 0.14% 16.25% 4.02%
78.72%
scheduler( 5) 0.48% 0.00% 0.42% 0.14% 17.04% 4.04%
77.88%
scheduler( 6) 0.46% 0.00% 0.39% 0.14% 16.57% 3.77%
78.67%
scheduler( 7) 0.53% 0.00% 1.88% 0.27% 17.60% 8.22%
71.50%
scheduler( 8) 0.47% 0.00% 0.42% 0.15% 16.60% 3.86%
78.50%
Stats per type:
async 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
aux 0.00% 10.72% 0.00% 0.00% 0.00% 0.00%
89.28%
dirty_cpu_sche 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
dirty_io_sched 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
100.00%
poll 0.00% 2.31% 0.00% 0.00% 0.00% 0.00%
97.69%
scheduler 0.48% 0.00% 0.61% 0.16% 16.98% 4.57%
77.20%
ok
Jonas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20180620/1c5efedb/attachment.htm>
More information about the erlang-questions
mailing list