[erlang-questions] Cost of doing +sbwt?

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Tue Sep 1 15:37:40 CEST 2015


On Tue, Sep 1, 2015 at 9:16 AM, Paul Davis <paul.joseph.davis@REDACTED>
wrote:

> Our first data point was to try and look at strace. What we noticed
> was that scheduler threads seemed to spend an inordinate amount of
> time in futex system calls. An strace run on a scheduler thread showed
> more than 50% of time in futex sys calls.
>

I may have other comments later, but this should tip you off as to what is
happening. A futex() syscall is made whenever a lock is contended. The
uncontended case can be handled with no kernel invocation. If you spend
your time here, you are contending for some resource somewhere inside the
system.

As Hynek said, if you are running on the bare metal and not in some puny
hypervisor, then setting something like `+sbt db` is often worth it. It
binds schedulers to specific logical processors so they don't jump around
and destroy your TLBs and caches into oblivion.
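
To check whether binding actually took effect on a running node, something
like the following can help. It is only a sketch: the module and function
names (`bind_check`, `report/0`) are made up for illustration, and it relies
solely on the `scheduler_bind_type` and `scheduler_bindings` items of
`erlang:system_info/1`.

```erlang
-module(bind_check).
-export([report/0]).

%% Returns {BindType, Bindings, UnboundCount}.
%% BindType is the bind type the VM reports as in use.
%% Bindings has one entry per scheduler: a logical processor id,
%% or the atom 'unbound'.
report() ->
    Type = erlang:system_info(scheduler_bind_type),
    Bindings = tuple_to_list(erlang:system_info(scheduler_bindings)),
    Unbound = length([B || B <- Bindings, B =:= unbound]),
    {Type, Bindings, Unbound}.
```

If the node was started with `erl +sbt db` and the third element is still
equal to the number of schedulers, the VM probably could not detect the CPU
topology or the platform does not support binding.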

There are two paths I'd pursue here. One is lock-counter (lcnt)
instrumentation in a staging environment: try to reproduce the problem
there and look at where the contention is. The other is to pray to god you
are running on FreeBSD/Illumos in production, in which case you can find
the lock contention with a 5 line DTrace script on the production cluster :)
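
For the lock-counter route, a rough sketch of a measurement run is below.
It assumes the node runs an emulator built with lock counting enabled (a
stock emulator will not have it), and the module name `lock_probe` and its
`profile/1` wrapper are hypothetical. Only `lcnt:clear/0`, `lcnt:collect/0`
and `lcnt:conflicts/0` come from the `lcnt` module in tools.

```erlang
-module(lock_probe).
-export([profile/1]).

%% Runs Workload (a fun of arity 0) between clearing and collecting the
%% runtime lock counters, then prints the most contended locks.
%% Returns ok on success, or an error tuple when lock counting is not
%% available on this emulator.
profile(Workload) when is_function(Workload, 0) ->
    try
        ok = lcnt:clear(),       % reset counters in the runtime system
        _ = Workload(),          % the code suspected of contention
        ok = lcnt:collect(),     % copy the counters into the lcnt server
        lcnt:conflicts()         % print locks sorted by contention
    catch
        Class:Reason ->
            {error, {lock_counting_unavailable, Class, Reason}}
    end.
```

On a staging node this could be called as
`lock_probe:profile(fun() -> timer:sleep(10000) end)` while the load
generator runs, so the counters cover ten seconds of traffic.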

Also, look at the current scheduler utilization:
`erlang:statistics(scheduler_wall_time)` (read the man page; you first need
to enable it with `erlang:system_flag(scheduler_wall_time, true)`). You want
to look at how much time the schedulers are spending doing useful work and
how much time they are just spinning, waiting for more work to come in.
Though the high CPU count you are seeing suggests, to me, a contended lock.
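
A minimal sketch of such a measurement, following the pattern in the
`erlang:statistics/1` documentation: take two samples and divide the change
in active time by the change in total time for each scheduler. The module
name `sched_util` and the function `sample/1` are illustrative only.

```erlang
-module(sched_util).
-export([sample/1]).

%% Returns [{SchedulerId, Utilization}] measured over IntervalMs
%% milliseconds, where Utilization is active time divided by total time
%% as reported by the VM (0.0 = idle, 1.0 = fully busy).
sample(IntervalMs) when is_integer(IntervalMs), IntervalMs > 0 ->
    %% The counters are off by default; this leaves them enabled.
    _ = erlang:system_flag(scheduler_wall_time, true),
    S0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(IntervalMs),
    S1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    %% Pair up samples by scheduler id; skip any with no elapsed time.
    [{Id, (A1 - A0) / (T1 - T0)}
     || {{Id, A0, T0}, {Id, A1, T1}} <- lists:zip(S0, S1), T1 > T0].
```

Calling `sched_util:sample(1000)` under load gives one ratio per scheduler.
Low ratios together with high CPU usage at the OS level would point away
from useful work and towards waiting or contention.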


-- 
J.