[erlang-questions] SMP performance with hackbench
Jiang Wei
jwhust@REDACTED
Wed Aug 19 10:17:34 CEST 2009
I tried hackbench with different numbers of groups. The number of processes
is (GroupNumber * 40), that is, 20 senders and 20 receivers in each
group.
Group     smp 8:8      smp disable
Number    time (s)     time (s)
1          0.605369     0.093884
10         4.951785     1.414963
50        17.635626     8.722577
100       25.744995    17.889164
200       56.963373    36.106899
Running with SMP disabled is always faster than the smp:8:8 case.
OProfile shows that almost 30% of CPU time is spent in pthread_mutex_*.
The migration logic may be responsible for it. I am looking into the Erlang
scheduler code and hope to find the reason there.
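
A minimal sketch of the arithmetic behind the table above, written in Python purely for illustration (it is not part of the attached hackbench code). It assumes the timings are in seconds and derives the process count per run (GroupNumber * 40) and the ratio of SMP time to non-SMP time. The names TIMINGS, PROCS_PER_GROUP and summarize are hypothetical.

```python
# Timings copied from the table above: group number -> (smp 8:8, smp disabled).
# Assumed to be seconds, matching the "Time is ..." output of the benchmark.
TIMINGS = {
    1: (0.605369, 0.093884),
    10: (4.951785, 1.414963),
    50: (17.635626, 8.722577),
    100: (25.744995, 17.889164),
    200: (56.963373, 36.106899),
}

PROCS_PER_GROUP = 40  # 20 senders + 20 receivers per group


def summarize(timings):
    """Return rows of (groups, processes, smp_time / non_smp_time)."""
    rows = []
    for groups, (smp, no_smp) in sorted(timings.items()):
        rows.append((groups, groups * PROCS_PER_GROUP, smp / no_smp))
    return rows


if __name__ == "__main__":
    for groups, procs, ratio in summarize(TIMINGS):
        print(f"{groups:>4} groups, {procs:>5} processes: "
              f"SMP takes {ratio:.2f}x the non-SMP time")
```

On these numbers the SMP run takes roughly 6.4 times as long as the non-SMP run with 1 group, and the gap narrows to about 1.4 to 1.6 times at 100 and 200 groups, so the relative cost of SMP shrinks as the process count grows but never drops below 1.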
2009/8/19 Sean Cribbs <seancribbs@REDACTED>
> In a dual quad-core setup, consider that there will be different
> message-passing speeds between:
>
> 1) cores on the same chip
> 2) cores on different chips
> 3) in some cases, cores in different combinations on the same chip (e.g.
> Nehalem quad-core processors have paired cores with some shared cache)
>
> If you're crossing boundaries between chips/cores frequently, you have to
> go through a cache or main RAM, which will be slower than running all on the
> same core. Try increasing the number of processes dramatically in your test
> and see how the SMP vs. non-SMP scenario pans out. 40 processes could be
> considered a small number of processes for an Erlang application.
>
> Sean Cribbs
>
> Jiang Wei wrote:
>
>> Hi, list
>> I wrote hackbench, originally a benchmark for the Linux scheduler, in
>> Erlang to test performance.
>> (Hackbench contains several groups; each group contains 20 pairs of
>> senders and receivers; each sender sends some messages to the 20
>> receivers in the same group. Performance is measured by the time taken;
>> less is better.)
>> The tests were carried out on an Intel server with 2 quad-core
>> processors and 4 GB of memory.
>> I am surprised by the results I got:
>> (1) SMP enable +S 8
>> root@REDACTED:~/hackbench# \time ./run_one_erl.sh
>> Time is 62.260995
>> 295.67user 110.62system 1:14.27elapsed 546%CPU (0avgtext+0avgdata
>> 0maxresident)k
>> 11776inputs+8outputs (27major+90965minor)pagefaults 0swaps
>> The run takes 62 sec and OProfile shows that 28% of CPU time is
>> spent in pthread_mutex_*.
>> (2) SMP disable
>> root@REDACTED:~/hackbench# \time ./run_one_erl.sh "-smp disable"
>> Time is 54.14644
>> 54.23user 0.33system 1:05.66elapsed 83%CPU (0avgtext+0avgdata
>> 0maxresident)k
>> 3968inputs+8outputs (22major+36520minor)pagefaults 0swaps
>> The run takes 54 sec and uses only 83% CPU.
>> So it seems that Erlang has trouble using all the SMP resources
>> because of serious lock contention in the SMP scheduler. Am I right?
>> And because I am new to Erlang, hackbench.erl may be badly written,
>> which would hurt performance. Can anyone help me review my
>> code?
>> I attach both the original C version of hackbench and my erlang
>> version one.
>> Thanks a lot!
>> (I am sorry if this is the wrong place to post this letter.)
>> --
>> Best Regards,
>> Jiang, Wei
>> ------------------------------------------------------------------------
>>
>>
>> ________________________________________________________________
>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>> erlang-questions (at) erlang.org
>>
>
>
--
Best Regards,
Jiang, Wei