[erlang-questions] SMP performance with hackbench
Jiang Wei
jwhust@REDACTED
Wed Aug 19 10:32:07 CEST 2009
I used "-define(DATA, 1)." in hackbench.erl, but the results didn't change
much.
The run with SMP disabled is still faster than the smp:8:8 runs.
Erlang R13B01 (erts-5.7.2) [source] [64-bit] [smp:8:8] [rq:8]
[async-threads:0] [hipe] [kernel-poll:false]
2> hackbench:main(300,1000).
66.791329
3> hackbench:main(300,1000).
68.51495800000001
With SMP disabled:
Erlang R13B01 (erts-5.7.2) [source] [64-bit] [rq:1] [async-threads:0] [hipe]
[kernel-poll:false]
2> hackbench:main(300,1000).
43.533186
3> hackbench:main(300,1000).
43.91207
Hackbench relies heavily on scheduler performance, because the scheduler
needs to run the right pair of sender and receiver.
So I think the problem is in the SMP logic, possibly the migration logic.
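
To make the workload concrete, here is a minimal sketch of a hackbench-like test. It is not the attached hackbench.erl: the module name hackbench_sketch and the functions run/2, bench/2 and sweep/2 are hypothetical names chosen for illustration, and the message format is an assumption. Only the group layout (20 senders and 20 receivers per group) and the small payload come from this thread.

```erlang
-module(hackbench_sketch).
-export([run/2, bench/2, sweep/2]).

-define(PAIRS, 20).  % sender/receiver pairs per group, as described in the thread
-define(DATA, 1).    % small payload instead of a shared binary

%% Run the benchmark and return the elapsed wall-clock time in seconds.
run(Groups, Loops) ->
    {Micros, ok} = timer:tc(?MODULE, bench, [Groups, Loops]),
    Micros / 1000000.

%% Start all groups, then wait until every receiver reports completion.
bench(Groups, Loops) ->
    Parent = self(),
    lists:foreach(fun(_) -> start_group(Parent, Loops) end,
                  lists:seq(1, Groups)),
    wait_receivers(Groups * ?PAIRS).

%% One group: ?PAIRS receivers, then ?PAIRS senders that each send
%% Loops messages to every receiver in the group.
start_group(Parent, Loops) ->
    Expected = Loops * ?PAIRS,  % messages each receiver must consume
    Receivers = [spawn(fun() -> receiver(Parent, Expected) end)
                 || _ <- lists:seq(1, ?PAIRS)],
    lists:foreach(fun(_) -> spawn(fun() -> sender(Receivers, Loops) end) end,
                  lists:seq(1, ?PAIRS)).

receiver(Parent, 0) ->
    Parent ! {done, self()};
receiver(Parent, N) ->
    receive
        {msg, _Data} -> receiver(Parent, N - 1)
    end.

sender(_Receivers, 0) ->
    ok;
sender(Receivers, N) ->
    lists:foreach(fun(R) -> R ! {msg, ?DATA} end, Receivers),
    sender(Receivers, N - 1).

wait_receivers(0) ->
    ok;
wait_receivers(N) ->
    receive
        {done, _Pid} -> wait_receivers(N - 1)
    end.

%% Repeat the run with 1..Max schedulers online and return
%% [{SchedulersOnline, Seconds}], restoring the old setting afterwards.
sweep(Groups, Loops) ->
    Old = erlang:system_info(schedulers_online),
    try
        [begin
             erlang:system_flag(schedulers_online, N),
             {N, run(Groups, Loops)}
         end || N <- lists:seq(1, erlang:system_info(schedulers))]
    after
        erlang:system_flag(schedulers_online, Old)
    end.
```

Calling hackbench_sketch:sweep(300, 1000) in an SMP emulator would show at which scheduler count the run time starts to grow. Note that one scheduler online in the SMP emulator is not the same configuration as "-smp disable", so the two should be compared separately.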
2009/8/19 Ulf Wiger <ulf.wiger@REDACTED>
>
> One thing you could try is to eliminate the shared binary
> and send a simple message instead, e.g.
>
> -define(DATA, 1).
>
> I don't know if it will make a big difference. Ideally,
> passing a shared binary will be as efficient, but this
> is at least a logical exclusion step.
>
> BR,
> Ulf W
>
> Jiang Wei wrote:
>
>> The test machine topology is [(0,1,4,5), (2,3,6,7)], and
>> erlang:system_info(cpu_topology) outputs:
>>
>> 1> erlang:system_info(cpu_topology).
>> [{processor,[{core,{logical,0}},
>> {core,{logical,4}},
>> {core,{logical,1}},
>> {core,{logical,5}}]},
>> {processor,[{core,{logical,2}},
>> {core,{logical,6}},
>> {core,{logical,3}},
>> {core,{logical,7}}]}]
>>
>> So it's right.
>> Then I bind schedulers to cpu cores:
>>
>> 2> erlang:system_flag(scheduler_bind_type,default_bind).
>> unbound
>> 3> erlang:system_info(scheduler_bindings).
>> {0,2,4,6,1,3,5,7}
>>
>> Re-run the hackbench:
>>
>> 4> c(hackbench).
>> ./hackbench.erl:56: Warning: variable 'Msg' is unused
>> {ok,hackbench}
>> 5> hackbench:main(300,1000).
>> 71.174117
>> // 300 groups, each group has 20 pairs of processes, total
>> 300*(20*2)=12000 processes, msg is sent 1000 times
>> 6> hackbench:main(300,1000).
>> 75.165799
>>
>> Without binding, with everything at the defaults:
>>
>> 3> hackbench:main(300,1000).
>> 67.151053
>> 4> hackbench:main(300,1000).
>> 72.056573
>>
>> It doesn't change much.
>> With SMP disabled:
>>
>> 2> hackbench:main(300,1000).
>> 53.942253
>>
>> *More info is in the attachment. (including uname -a, /etc/issue,
>> /proc/cpuinfo, erlang version, gcc version)
>> 2009/8/19 Zoltan Lajos Kis <kiszl@REDACTED>
>>
>>
>>
>> Hi,
>> First check if the cpu topology is properly identified:
>> erlang:system_info(cpu_topology). If not, set it manually:
>> erlang:system_flag(cpu_topology, Topo). (see slide* 27 for Topo).
>> Then bind the schedulers to cpu cores:
>> erlang:system_flag(scheduler_bind_type,default_bind). Check that the
>> binding succeeded: erlang:system_info(scheduler_bindings).
>> Try the SMP test again with these settings, and please tell us the
>> new results.
>> *see slides 22-28 in Kenneth's talk on multicore:
>>
>> http://www.erlang-factory.com/upload/presentations/105/KennethLundin-ErlangFactory2009London-AboutErlangOTPandMulti-coreperformanceinparticular.pdf
>> Regards,
>> Zoltan.
>> Jiang Wei wrote:
>>
>> Hi, list
>> I wrote hackbench, originally a benchmark for the Linux
>> scheduler, in Erlang to test performance.
>> (Hackbench contains several groups; each group contains 20
>> pairs of senders and receivers; each sender sends some
>> messages to the 20 receivers in the same group. Performance
>> is measured by the time taken; less is better.)
>> The tests were carried out on an Intel server with 2
>> quad-core processors and 4 GB of memory.
>> I am surprised by the results I got:
>> (1) SMP enabled, +S 8
>> root@REDACTED:~/hackbench# \time ./run_one_erl.sh
>> Time is 62.260995
>> 295.67user 110.62system 1:14.27elapsed 546%CPU
>> (0avgtext+0avgdata 0maxresident)k
>> 11776inputs+8outputs (27major+90965minor)pagefaults 0swaps
>> The run takes 62 sec, and oprofile shows that 28% of CPU
>> time is spent in pthread_mutex_*.
>> (2) SMP disabled
>> root@REDACTED:~/hackbench# \time ./run_one_erl.sh "-smp disable"
>> Time is 54.14644
>> 54.23user 0.33system 1:05.66elapsed 83%CPU
>> (0avgtext+0avgdata 0maxresident)k
>> 3968inputs+8outputs (22major+36520minor)pagefaults 0swaps
>> The run takes 54 sec and uses only 83% CPU.
>> So it seems that Erlang has trouble using all the SMP
>> resources because of serious lock contention in the SMP
>> scheduler. Am I right?
>> And because I am new to Erlang, hackbench.erl may be
>> poorly written, which would hurt performance. Can anyone help
>> me review my code?
>> I attach both the original C version of hackbench and my
>> Erlang version.
>> Thanks a lot!
>> (I am sorry if this is the wrong place to post this.)
>> -- Best Regards,
>> Jiang, Wei
>>
>> ------------------------------------------------------------------------
>>
>>
>> ________________________________________________________________
>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>> erlang-questions (at) erlang.org
>>
>>
>>
>> --
>> Best Regards,
>> Jiang, Wei
>>
>>
>>
>
>
> --
> Ulf Wiger
> CTO, Erlang Training & Consulting Ltd
> http://www.erlang-consulting.com
>
--
Best Regards,
Jiang, Wei