[erlang-questions] Heavy duty UDP server performance

Lukas Larsson
Thu Feb 11 10:22:55 CET 2016


Hello,

Would you mind sending me the "perf script" output from your profiling runs
so that I can have a closer look?

For the record, CPU utilization is a terrible way to measure load on a
system, especially if you are running well below full throttle. Because
going to sleep is so expensive, the scheduler threads in the Erlang VM spin
and consume CPU when they run out of work. If you are running at 25% CPU,
you are running out of work very often, which causes more spinning. That
increases CPU usage without decreasing the amount of work the system can
do.
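
A minimal sketch of one way to look at scheduler load from inside the VM
instead of OS-level CPU usage, using erlang:statistics(scheduler_wall_time).
The module name sched_util and the function sample/1 are made up for
illustration, and this assumes the active time reported by the VM leaves
out the time schedulers spend spinning while waiting for work:

```erlang
-module(sched_util).
-export([sample/1]).

%% Returns [{SchedulerId, Utilization}] measured over Ms milliseconds,
%% where Utilization is a float between 0.0 and 1.0.
sample(Ms) ->
    %% Wall time accounting is off by default and has to be enabled.
    erlang:system_flag(scheduler_wall_time, true),
    %% The order of the returned list is unspecified, so sort by id.
    T0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(Ms),
    T1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    %% Utilization = change in active time / change in total time.
    [{Id, (A1 - A0) / (W1 - W0)}
     || {{Id, A0, W0}, {Id, A1, W1}} <- lists:zip(T0, T1)].
```

Calling sched_util:sample(1000) while the benchmark is running gives one
tuple per scheduler, which makes it easier to see how the work is spread.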

One way to work around that is to compact the load of the schedulers
better, so that instead of running at 25% on each scheduler, you run at
100% on one and 0% on the other three. The default config attempts to do
this, but it is a trade-off between how fast you want to be able to react
to an increase in load and how much CPU you want to spend spinning, so it
is not as aggressive as it could be. You can change how aggressive it is by
setting the scheduler wakeup threshold to a higher value, e.g. "+swt
very_high". This may reduce some of the time that you see in scheduler_wait
when profiling.

Another way to affect the wait time is by changing the scheduler busy wait
threshold so that the schedulers spin for a shorter or longer time when
they run out of work, e.g. "+sbwt short" or "+sbwt long". You will have to
experiment and see which is best for this specific benchmark.
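
As a sketch of how these flags are passed: both are emulator flags, so they
go on the erl command line. Each line below is an alternative invocation,
and the values are only starting points for experimentation, not
recommendations:

```shell
# Compact load onto fewer schedulers by raising the wakeup threshold.
erl +swt very_high

# Spin for a shorter or a longer time before a scheduler goes to sleep.
erl +sbwt short
erl +sbwt long

# The two flags can also be combined in a single run.
erl +swt very_high +sbwt short
```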

Fun fact, in virtualized environments it is crazy crazy crazy expensive to
go to sleep. By compacting load we have seen systems get a reduction of up
to 10% CPU usage without changing anything but where the work is scheduled.

Lukas

On Wed, Feb 10, 2016 at 9:49 PM, Ameretat Reith wrote:

> On Tue, 9 Feb 2016 16:19:07 +0100
> Jesper Louis Andersen wrote:
>
> > Where is that time spent: in the Erlang VM or in the kernel? You are
> > potentially on a wakeup schedule of 81 wakeups per millisecond to
> > handle packets, which suggests you need to understand where your CPU
> > time is spent in the system in order to tune it for lower CPU usage.
>
> On more powerful machines that can handle 1Gbit/s at 50% CPU
> utilization, the VM's scheduler_wait consumes about 30% of CPU usage
> (xeon-e3 attachment), suggesting the system can take more input. On
> overloaded and less powerful machines, VM task management functions get
> more resources, and time spent in the kernel is less than 20% (corei3
> attachment).