[erlang-questions] UDP receive performance
Jesper Louis Andersen
jesper.louis.andersen@REDACTED
Thu May 24 23:22:17 CEST 2018
I looked up the unaligned stuff. There is no aligned variant, and the
unaligned variant just sets up a prologue before entering the main loop,
where you do have alignment. So I wouldn't worry about that. I would worry
more about where the calls are being made and where the memory is copied around.
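
To illustrate the prologue-then-aligned-loop shape described above, here is
a minimal C sketch. It is not glibc's implementation (which uses vector
instructions and handles overlap); the function name copy_with_aligned_loop
and the word-sized stride are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not glibc code: copy n bytes from src to dst
 * (regions must not overlap), aligning the destination before the
 * bulk loop runs. */
void *copy_with_aligned_loop(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t word = sizeof(uintptr_t);

    /* Prologue: byte copies until the destination is word-aligned. */
    while (n > 0 && ((uintptr_t)d % word) != 0) {
        *d++ = *s++;
        n--;
    }
    assert(n == 0 || ((uintptr_t)d % word) == 0);

    /* Main loop: one word per iteration with an aligned destination.
     * Fixed-size memcpy avoids assuming the source is aligned too. */
    while (n >= word) {
        uintptr_t w;
        memcpy(&w, s, word);
        memcpy(d, &w, word);
        d += word;
        s += word;
        n -= word;
    }

    /* Epilogue: remaining tail bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

The point of the sketch is only that the "unaligned" entry point pays a small
fixed cost up front; the bulk of the time is still spent in the main loop,
so the copy volume matters more than the alignment.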
On Thu, May 24, 2018 at 5:35 PM Danil Zagoskin <z@REDACTED> wrote:
> - 2.57% 0.02% 31_scheduler beam.smp [.] process_main
> - 2.55% process_main
> - 2.43% erts_schedule
> - 2.26% erts_port_task_execute
> - 2.25% packet_inet_input.isra.31
> - 2.05% driver_realloc_binary
> - 2.05% realloc_thr_pref
> 1.87% __memmove_avx_unaligned_erms
>
> That's a 40-core Xeon E5-2640, so 2.5% on a single scheduler is roughly
> 100% of one core (100% / 40 = 2.5%). Also, it's Linux kernel 4.9.
>
>
> On a machine with kernel 4.13 and a quad-core Xeon E31225, at half of the
> E5's load, we have:
> - 16.11% 0.10% 1_scheduler beam.smp [.] erts_schedule
> - 16.01% erts_schedule
> - 13.62% erts_port_task_execute
> - 13.11% packet_inet_input.isra.31
> - 11.37% driver_realloc_binary
> - 11.33% realloc_thr_pref
> - 10.50% __memcpy_avx_unaligned
> 5.06% __memcpy_avx_unaligned
> + 1.04% page_fault
> 0.66% do_erts_alcu_realloc.constprop.31
> + 0.79% 0x108f3
> 0.55% driver_deliver_term
> 1.30% sched_spin_wait
>
> It seems the kernel version may change a lot; I will run more tests.
>
> But it seems the memory operations are unaligned, which may not be very
> efficient.
>
> On Thu, May 24, 2018 at 1:24 PM, Lukas Larsson <lukas@REDACTED> wrote:
>
>> Can you run perf with "--call-graph dwarf" and see which functions
>> call memmove?
>>
>> On Thu, May 24, 2018 at 12:21 PM, Danil Zagoskin <z@REDACTED> wrote:
>>
>>> Yes, I've built a fresh master today (Erlang/OTP 21 [RELEASE CANDIDATE
>>> 1] [erts-9.3.1]), and nothing has changed.
>>>
>>> On Thu, May 24, 2018 at 1:17 PM, Sergej Jurečko <
>>> sergej.jurecko@REDACTED> wrote:
>>>
>>>> OTP-21 rc1 has enhanced IO scalability. Have you checked whether it is
>>>> any better? UDP performance in Erlang was never great...
>>>>
>>>> Regards,
>>>> Sergej
>>>>
>>>>
>>>> On 24 May 2018, at 12:03, Danil Zagoskin <z@REDACTED> wrote:
>>>>
>>>> Yes, we have {read_packets, 100} in receive socket options.
>>>>
>>>> On Thu, May 24, 2018 at 10:23 AM, Raimo Niskanen <
>>>> raimo+erlang-questions@REDACTED> wrote:
>>>>
>>>>> On Wed, May 23, 2018 at 06:28:55PM +0300, Danil Zagoskin wrote:
>>>>> > Hi!
>>>>> >
>>>>> > We have a performance problem receiving lots of UDP traffic.
>>>>> > There are a lot (about 70) of UDP receive processes, each handling about 1
>>>>> > to 10 megabits of multicast traffic, with {active, N}.
>>>>>
>>>>> Whenever someone has UDP receive performance problems, one has to ask
>>>>> whether they have seen the Erlang socket option {read_packets,N}.
>>>>>
>>>>> See http://erlang.org/doc/man/inet.html#setopts-2
>>>>>
>>>>> >
>>>>> > msacc summary on my OSX laptop, built from OTP master
>>>>> > c30309e799212b080c39ee2f91af3f9a0383d767 (Apr 19):
>>>>> >
>>>>> >
>>>>> >    Thread     alloc       aux       bif busy_wait  check_io  emulator       ets        gc
>>>>> > scheduler    30.02%     0.92%     2.86%    24.66%     0.01%     9.61%     0.03%     1.25%
>>>>> >
>>>>> >    Thread   gc_full       nif     other      port      send     sleep    timers
>>>>> > scheduler     0.20%     0.13%     2.34%     9.33%     0.41%    17.78%     0.44%
>>>>> >
>>>>> >
>>>>> > The Linux production server behaves the same way (we do not have extended
>>>>> > msacc there yet, so most of alloc goes to port).
>>>>> >
>>>>> > perf top (on Linux production) says there's a lot of unaligned memmove:
>>>>> >
>>>>> > 69.76% libc-2.24.so [.] __memmove_sse2_unaligned_erms
>>>>> > 6.13% beam.smp [.] process_main
>>>>> > 2.02% beam.smp [.] erts_schedule
>>>>> > 0.87% [kernel] [k] copy_user_enhanced_fast_string
>>>>> >
>>>>> >
>>>>> > I'll try to make a minimal example for this.
>>>>> > Maybe there are simple recommendations on optimizing this kind of load?
>>>>> >
>>>>> > --
>>>>> > Danil Zagoskin | z@REDACTED
>>>>>
>>>>> > _______________________________________________
>>>>> > erlang-questions mailing list
>>>>> > erlang-questions@REDACTED
>>>>> > http://erlang.org/mailman/listinfo/erlang-questions
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Danil Zagoskin | z@REDACTED
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Danil Zagoskin | z@REDACTED
>>>
>>>
>>>
>>
>
>
> --
> Danil Zagoskin | z@REDACTED