SMP performance with hackbench

Wed Aug 26 23:23:51 CEST 2009

The major problem is lock contention on the memory allocator locks. I've 
made a hack that reduces the contention, and it seems to solve the 
problem. A solution that can be released needs more work, though. 
It won't make it into R13B02, but will most likely make it into R13B03.

The lack of flow control (which Roger Larsson noted) also caused problems, 
since memory usage sometimes grew considerably with multiple 
schedulers. The extra memory usage in turn hurt performance. I made 
my own version of the Erlang hackbench (ehb.erl, attached) which has 
flow control and passes about 100 bytes per message (on a 64-bit machine).
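
As an illustration of what flow control buys here, the following is a 
minimal sketch of a credit-based scheme: the sender may have at most 
`window` unacknowledged messages outstanding, so the receiver's mailbox 
cannot grow without bound. It is written in Python rather than Erlang, 
and it is not the mechanism used in ehb.erl (which is only attached, not 
quoted). The names `run_with_flow_control`, `window` and `credits` are 
made up for this example.

```python
import queue
import threading


def run_with_flow_control(messages, window):
    """Send `messages` payloads to one receiver, allowing at most
    `window` unacknowledged messages at any time.

    Returns (number of messages received, peak mailbox length seen
    by the sender).
    """
    mailbox = queue.Queue()                # unbounded, like a process mailbox
    credits = threading.Semaphore(window)  # one credit per message in flight
    received = []

    def receiver():
        for _ in range(messages):
            received.append(mailbox.get())
            credits.release()              # acknowledge: hand one credit back

    worker = threading.Thread(target=receiver)
    worker.start()

    payload = b"x" * 100                   # roughly 100 bytes per message
    peak = 0
    for _ in range(messages):
        credits.acquire()                  # blocks while `window` are unacked
        mailbox.put(payload)
        peak = max(peak, mailbox.qsize())

    worker.join()
    return len(received), peak


if __name__ == "__main__":
    count, peak = run_with_flow_control(messages=1000, window=4)
    print("received:", count, "peak mailbox length:", peak)
```

Without the `credits.acquire()` call the sender could enqueue all 
messages before the receiver handles any of them, which is the kind of 
memory growth described above.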

I tested the hacked SMP emulator on a 2x quad-core machine running a Linux 
2.6.16 kernel. I ran ehb with 8 schedulers and with 1 scheduler on the 
hacked SMP emulator, and compared with the original hackbench_old.c 
benchmark using pipes (which scaled best) with affinity ff and 1, i.e., 
allowing processes on 8 cores and on 1 core respectively. The speedups 
were similar for ehb and hackbench_old.c, ranging from 4 to 8 depending 
on the number of groups and loops used. Sometimes ehb performed better 
and sometimes hackbench_old performed better.
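
For readers unfamiliar with the notation, the sketch below decodes the 
affinity values and shows how a speedup figure is computed. It assumes 
that "ff" and "1" are hexadecimal CPU masks of the kind taskset accepts 
(one bit per core), which the message does not state explicitly. The 
helper names and the timings in the usage example are invented for 
illustration and are not measurements from this test.

```python
def cores_in_mask(hex_mask):
    """Return the core numbers enabled by a hexadecimal affinity mask."""
    mask = int(hex_mask, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def speedup(single_core_seconds, multi_core_seconds):
    """Speedup is the single-core run time divided by the multi-core run time."""
    return single_core_seconds / multi_core_seconds


if __name__ == "__main__":
    print(cores_in_mask("ff"))   # cores 0 through 7
    print(cores_in_mask("1"))    # core 0 only
    # Made-up timings, purely to show the arithmetic.
    print(speedup(12.0, 2.0))
```

With 8 cores, a speedup of 8 would be perfectly linear scaling, so the 
reported range of 4 to 8 corresponds to between half and all of that 
ideal.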

Note that I compared with the SMP emulator with 1 scheduler and not the 
non-SMP emulator, but the affinity 1 case was likewise run on a Linux 
kernel with SMP support. The non-SMP emulator performs much better than 
the SMP emulator with 1 scheduler on this benchmark, since the benchmark 
hammers on things that use locks, but I'm guessing that we would see 
similar results when comparing a Linux kernel with SMP support against 
one without.

Regards,
Rickard
-- 
Rickard Green, Erlang/OTP, Ericsson AB.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ehb.erl
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20090826/6f901d12/attachment.ksh>