SMP performance with hackbench
Wed Aug 26 23:23:51 CEST 2009
The major problem is lock contention on the memory allocator locks. I've
made a hack that reduces the lock contention, and it seems to solve the
problem. A final solution that can be released needs more work, though.
It won't make it into R13B02, but will most likely make it into R13B03.
The lack of flow control (which Roger Larsson noted) also caused
problems, since memory usage sometimes increased considerably with
multiple schedulers. The extra memory usage also hurt performance. I
made my own version of the Erlang hackbench (ehb.erl, attached), which
has flow control and passes about 100 bytes per message (on a 64-bit
machine).
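To illustrate what flow control means here, a minimal sketch of a
windowed, acknowledgement-based scheme follows. It is not the attached
ehb.erl and may differ from it in every detail; the module name
flow_sketch, the function names, the message shapes and the 100-byte
binary payload are all assumptions made for the example. The point is
only that the sender never has more than Window unacknowledged messages
outstanding, so the receiver's message queue (and thus memory usage)
stays bounded.

```erlang
-module(flow_sketch).
-export([run/2]).

%% run(Loops, Window): send Loops messages to one receiver while keeping
%% at most Window messages unacknowledged. Returns ok when all messages
%% have been received and acknowledged.
run(Loops, Window) when Loops >= 0, Window > 0 ->
    Parent = self(),
    Receiver = spawn_link(fun() -> receiver(Parent, Loops) end),
    %% Hypothetical payload: a 100-byte binary.
    Payload = list_to_binary(lists:duplicate(100, 0)),
    ok = sender(Receiver, Payload, Loops, 0, Window),
    receive
        {done, Receiver} -> ok
    end.

%% Everything sent and every acknowledgement collected.
sender(_Receiver, _Payload, 0, 0, _Window) ->
    ok;
%% Window full, or nothing left to send: block until one ack arrives.
sender(Receiver, Payload, Left, InFlight, Window)
  when InFlight >= Window; Left =:= 0 ->
    receive
        {ack, Receiver} ->
            sender(Receiver, Payload, Left, InFlight - 1, Window)
    end;
%% Room in the window: send one more message.
sender(Receiver, Payload, Left, InFlight, Window) ->
    Receiver ! {msg, self(), Payload},
    sender(Receiver, Payload, Left - 1, InFlight + 1, Window).

%% Acknowledge each message; tell the parent when all have arrived.
receiver(Parent, 0) ->
    Parent ! {done, self()};
receiver(Parent, Left) ->
    receive
        {msg, From, _Payload} ->
            From ! {ack, self()},
            receiver(Parent, Left - 1)
    end.
```

For example, flow_sketch:run(1000, 10) sends 1000 messages with at most
10 in flight at any time. Without the window, a fast sender can fill the
receiver's queue without limit.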
I tested the hacked SMP emulator on a 2x quad-core machine running a
Linux 2.6.16 kernel. I ran ehb with 8 schedulers and with 1 scheduler on
the hacked SMP emulator, and compared with the original hackbench_old.c
benchmark using pipes (which scaled best) with affinity ff and 1, i.e.,
allowing processes on 8 cores and on 1 core, respectively. The speedups
were similar for ehb and hackbench_old.c, ranging from 4 to 8 depending
on the number of groups and loops used. Sometimes ehb performed better
and sometimes hackbench_old.c performed better.
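Speedup here means the run time with 1 scheduler divided by the run time
with all schedulers. A minimal sketch of such a measurement follows. It
switches the number of online schedulers at runtime with
erlang:system_flag(schedulers_online, N), which is an assumption made
for the example and not necessarily how the numbers above were
obtained; starting separate emulators with +S 1 and +S 8 is the other
obvious option. The module name speedup_sketch and the function
measure/1 are hypothetical.

```erlang
-module(speedup_sketch).
-export([measure/1]).

%% measure(Fun): run Fun once with 1 scheduler online and once with all
%% schedulers online; return T1 / TN as a float.
measure(Fun) when is_function(Fun, 0) ->
    Max = erlang:system_info(schedulers),
    %% system_flag returns the previous value, so it can be restored.
    Old = erlang:system_flag(schedulers_online, 1),
    try
        {T1, _} = timer:tc(Fun),
        _ = erlang:system_flag(schedulers_online, Max),
        {TN, _} = timer:tc(Fun),
        %% timer:tc reports microseconds; avoid dividing by zero.
        T1 / max(TN, 1)
    after
        erlang:system_flag(schedulers_online, Old)
    end.
```

A single run of each configuration is noisy; repeating the measurement
for several group and loop counts gives a fairer picture.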
Note that I compared with the SMP emulator with 1 scheduler and not the
non-SMP emulator, but I also compared with a Linux kernel with SMP
support in the affinity 1 case. The non-SMP emulator performs much
better than the SMP emulator with 1 scheduler when running this
benchmark, since the benchmark hammers on things that use locks, but
I'm guessing that we would see similar results when comparing a Linux
kernel with SMP support against one without SMP support.
Rickard Green, Erlang/OTP, Ericsson AB.