[erlang-questions] UDP receive performance
Lukas Larsson
lukas@REDACTED
Thu May 24 12:24:37 CEST 2018
Can you run perf with "--call-graph dwarf" and see which functions are
calling memmove?
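
A minimal sketch of what such a perf run could look like, assuming Linux perf is installed and the symbol name matches the one in the perf top output quoted below. The helper name profile_memmove_callers, the 10-second sampling window and the default output file are hypothetical choices for illustration, not something specified in this thread.

```shell
# Hypothetical helper: record DWARF-unwound call graphs from a running
# beam.smp, then print the call chains leading to the memmove symbol.
profile_memmove_callers() {
    pid="$1"                 # OS pid of the beam.smp process (caller supplies it)
    out="${2:-perf.data}"    # where to store the samples (assumed default)

    # Sample the given pid for as long as "sleep 10" runs.
    perf record --call-graph dwarf -p "$pid" -o "$out" -- sleep 10 &&
    # Restrict the report to the memmove symbol; without --children the
    # call graph is callee-based, so its callers are listed underneath it.
    perf report -i "$out" --no-children --stdio \
        --symbols=__memmove_sse2_unaligned_erms
}
```

It could be invoked as profile_memmove_callers "$(pgrep -o beam.smp)", assuming a single Erlang VM is running on the host.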
On Thu, May 24, 2018 at 12:21 PM, Danil Zagoskin <z@REDACTED> wrote:
> Yes, I've built a fresh master today (Erlang/OTP 21 [RELEASE CANDIDATE 1]
> [erts-9.3.1]), and nothing has changed.
>
> On Thu, May 24, 2018 at 1:17 PM, Sergej Jurečko <sergej.jurecko@REDACTED>
> wrote:
>
>> OTP-21 rc1 has enhanced IO scalability. Have you checked whether it is any
>> better? UDP performance in Erlang was never great...
>>
>> Regards,
>> Sergej
>>
>>
>> On 24 May 2018, at 12:03, Danil Zagoskin <z@REDACTED> wrote:
>>
>> Yes, we have {read_packets, 100} in receive socket options.
>>
>> On Thu, May 24, 2018 at 10:23 AM, Raimo Niskanen <
>> raimo+erlang-questions@REDACTED> wrote:
>>
>>> On Wed, May 23, 2018 at 06:28:55PM +0300, Danil Zagoskin wrote:
>>> > Hi!
>>> >
>>> > We have a performance problem receiving lots of UDP traffic.
>>> > There are a lot (about 70) of UDP receive processes, each handling
>>> > about 1 to 10 megabits of multicast traffic, with {active, N}.
>>>
>>> Whenever someone has UDP receive performance problems, one has to ask:
>>> have you seen the Erlang socket option {read_packets,N}?
>>>
>>> See http://erlang.org/doc/man/inet.html#setopts-2
>>>
>>> >
>>> > msacc summary on my OSX laptop, built from OTP master
>>> > c30309e799212b080c39ee2f91af3f9a0383d767 (Apr 19):
>>> >
>>> >
>>> > Thread      alloc    aux    bif  busy_wait  check_io  emulator    ets     gc
>>> > scheduler  30.02%  0.92%  2.86%     24.66%     0.01%     9.61%  0.03%  1.25%
>>> >
>>> > Thread     gc_full    nif  other   port   send   sleep  timers
>>> > scheduler    0.20%  0.13%  2.34%  9.33%  0.41%  17.78%   0.44%
>>> >
>>> >
>>> > Linux production server behaves the same way (we do not have extended
>>> > msacc there yet, so most of alloc goes to port).
>>> >
>>> > perf top (on Linux production) says there's a lot of unaligned memmove:
>>> >
>>> > 69.76% libc-2.24.so [.] __memmove_sse2_unaligned_erms
>>> > 6.13% beam.smp [.] process_main
>>> > 2.02% beam.smp [.] erts_schedule
>>> > 0.87% [kernel] [k] copy_user_enhanced_fast_string
>>> >
>>> >
>>> > I'll try to make a minimal example for this.
>>> > Maybe there are simple recommendations on optimizing this kind of load?
>>> >
>>> > --
>>> > Danil Zagoskin | z@REDACTED
>>>
>>> > _______________________________________________
>>> > erlang-questions mailing list
>>> > erlang-questions@REDACTED
>>> > http://erlang.org/mailman/listinfo/erlang-questions
>>>
>>>
>>> --
>>>
>>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>
>>
>>
>> --
>> Danil Zagoskin | z@REDACTED
>
>
> --
> Danil Zagoskin | z@REDACTED
>