[erlang-questions] Chameneos.redux micro benchmark
Edwin Fine
erlang-questions_efine@REDACTED
Sat Oct 11 01:19:14 CEST 2008
It was run with HiPE. It's mentioned on the page that has the Erlang code.
I was curious, so I ran it on my configuration (Ubuntu 8.04 x86_64, 2.4 GHz
Q6600, 8 GB RAM, Erlang R12B-4).
Here's the bottom line.
When I ran it with -smp disable, there was a monumentally huge (91.8x)
performance boost over running with SMP enabled. That was without HiPE.
Adding HiPE sped the SMP version up by about 20%. The corresponding gap
between the SMP and non-SMP runs with HiPE was about 80x.
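For anyone reproducing the with/without HiPE numbers: the erl command line is
the same in both cases because HiPE is a compile-time option, so the
difference is in how the .beam was built. A rough sketch from the Erlang
shell, assuming the source file is chameneosredux.erl and guessing at the
HiPE options (the shootout page with the Erlang code has the exact build
flags):

    %% plain BEAM compile (the "without HiPE" runs)
    c(chameneosredux).

    %% native compile via HiPE (the "with HiPE" runs); [o3] is just my guess
    c(chameneosredux, [native, {hipe, [o3]}]).

    %% sanity check that the loaded module really is native code
    code:is_module_native(chameneosredux).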
# SMP without HiPE
$ time /usr/local/bin/erl +K true -noshell -run chameneosredux main 6000000
real    11m24.231s
user    20m25.933s
sys     1m49.315s

# SMP with HiPE
$ time /usr/local/bin/erl +K true -noshell -run chameneosredux main 6000000
real    9m19.138s
user    16m28.374s
sys     1m49.899s

# SMP disabled, without HiPE
$ time /usr/local/bin/erl -smp disable +K true -noshell -run chameneosredux main 6000000
real    0m7.451s
user    0m7.404s
sys     0m0.048s

# SMP disabled, with HiPE
$ time /usr/local/bin/erl -smp disable +K true -noshell -run chameneosredux main 6000000
real    0m6.970s
user    0m6.864s
sys     0m0.104s
So if it's not CPU-bound (60% idle), and it's not memory capacity bound
(virtual memory usage only 78MB), and it's not disk or network I/O bound,
what is it?
A run of vmstat showed a minimum of about 15,000 context switches per second,
and often more. Without the program running, there were only about 500 or so
per second.
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 6  0 371656 651900 646988 967940    0    0     0     0   67 15781 44  1 55  0
 3  0 371656 651900 646992 967936    0    0     0     4  179 18291 44  2 54  0
 1  0 371656 651900 646992 967940    0    0     0     0   61 28605 44  2 54  0
 2  0 371656 651892 646992 967940    0    0     0     0  156 15751 42  1 57  0
 3  0 371656 651900 646992 967940    0    0     0     0   61 78329 47  6 48  0
 3  0 371656 651776 646992 967940    0    0     0     0  156 15081 43  1 56  0
 3  0 371656 651776 646992 967940    0    0     0     0   61 15800 44  1 55  0
 2  0 371656 651776 646996 967936    0    0     0     4  157 15393 44  1 55  0
 2  0 371656 651760 646996 967940    0    0     0     0   62 16251 46  1 53  0
 2  0 371656 651776 646996 967940    0    0     0     0  156 35213 43  4 53  0
I ran again with vmstat and -smp disable. vmstat showed no noticeable
difference in the cs column between when the program was running and when it
was not:
<program not running>
 0  0 371656 707156 624840 947328    0    0     0    28  110   453  0  0 100  0
 0  0 371656 707148 624840 947328    0    0     0     0  285   775  0  0  99  0
 0  0 371656 707140 624840 947328    0    0     0     0  102   537  0  0 100  0
 0  0 371656 707148 624840 947328    0    0     0     0  172   597  0  0 100  0
 0  0 371656 707148 624840 947328    0    0     0     0  113   532  0  0 100  0
 1  0 371656 700460 624840 947328   32    0    32     0  204  1290 24  1  75  0
<program started here>
 1  0 371656 700436 624840 947328    0    0     0     0   71   655 28  0  72  0
 1  0 371656 700412 624844 947324    0    0     0    44  302   994 26  0  74  0
 1  0 371656 700436 624844 947328    0    0     0     0   98   520 26  0  74  0
 1  0 371656 700436 624844 947328    0    0     0     0  156   558 25  0  75  0
 1  0 371656 700428 624844 947328    0    0     0     0   76   431 25  0  74  0
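As an aside, the emulator keeps its own context-switch counter (Erlang
process switches, not OS thread switches, so it won't line up with vmstat's
cs column). If the node is started with a shell (or with -sname so you can
attach one), a fun like this reports the VM's own switch rate over a sampling
window; just a sketch:

    %% switches per second over a window of Millis milliseconds;
    %% statistics(context_switches) is the total since the node started
    CsRate = fun(Millis) ->
                 {Before, _} = statistics(context_switches),
                 timer:sleep(Millis),
                 {After, _} = statistics(context_switches),
                 (After - Before) * 1000 div Millis
             end.
    CsRate(1000).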
Tentative conclusion: this benchmark makes SMP Erlang do an excessive number
of context switches. Is that because processes are jumping between cores, or
because of inter-process communication between cores? I can't answer that
fully, but I can see what happens if we restrict it to one core using one VM.
And the answer is: vmstat shows that with taskset and +S 1, the context
switches go down to about the same level as when not running with SMP. It's
still about 3x slower than -smp disable, but well over an order of magnitude
faster than using all cores with SMP.
# SMP, without HiPE, one scheduler, pinned to CPU 2
$ time taskset -c 2 /usr/local/bin/erl +S 1 -noshell -noinput -run chameneosredux main 6000000
real    0m24.296s
user    0m24.270s
sys     0m0.004s
Last one. What about adding +K true to use kernel poll? No significant
difference.
real    0m24.006s
user    0m23.998s
sys     0m0.012s
I tried capturing the output of various runs using strace, but it's going to
take me a while to interpret the results (if I can even do that) and rerun
things until they make sense. I don't know whether strace is even a sensible
tool to use with Erlang; I'll have to do some Googling.
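One more thing I may try: a bare-bones ping-pong between two processes, to
separate raw message-passing and scheduling cost from the chameneos logic
itself. If it shows the same SMP vs non-SMP gap, the problem is in the
runtime, not the benchmark code. Something along these lines (my own sketch,
not the shootout program):

    -module(pingpong).
    -export([run/1, ping/2]).

    %% Time N round trips between two processes; prints wall-clock microseconds.
    run(N) ->
        Pong = spawn(fun pong/0),
        {Micros, ok} = timer:tc(?MODULE, ping, [Pong, N]),
        io:format("~p round trips in ~p us~n", [N, Micros]).

    %% Send a ping, wait for the pong, repeat N times, then stop the peer.
    ping(Pong, 0) ->
        Pong ! stop,
        ok;
    ping(Pong, N) ->
        Pong ! {ping, self()},
        receive pong -> ok end,
        ping(Pong, N - 1).

    %% Echo process: reply to every ping until told to stop.
    pong() ->
        receive
            {ping, From} ->
                From ! pong,
                pong();
            stop ->
                ok
        end.

Running pingpong:run(1000000) from the shell under -smp enable, -smp disable,
and the taskset/+S 1 setup above should show whether plain message passing
already accounts for the gap.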
Regards,
Edwin Fine
2008/10/10 Kevin Scaldeferri <kevin@REDACTED>
>
> On Oct 10, 2008, at 12:54 PM, Greg Burri wrote:
>
> Hi,
> I'm very surprised to see the difference between these two runs of the same
> benchmark on shootout.alioth.debian.org:
> 1) Quad core:
> http://shootout.alioth.debian.org/u64q/benchmark.php?test=chameneosredux&lang=hipe
> 2) Mono core:
> http://shootout.alioth.debian.org/u64/benchmark.php?test=chameneosredux&lang=hipe
>
> Here are the CPU times:
> 1) 2095.18 s
> 2) 37.03 s
>
> I tried on my machine[1] with a) "-smp enable" and b) "-smp disable":
> a) 47.863 s
> b) 18.285 s
>
> Maybe it's not strange to see such a difference because of inter-CPU
> message passing, but the difference on shootout.alioth.debian.org is too
> large. What should we do?
>
>
> Are you using HiPE? There's some chance that could explain some of the
> relative difference.
>
> I don't think message passing is the issue. I suspect it's lack of process
> affinity. The chameneos processes are likely getting bounced around between
> schedulers constantly.
>
> In the short term, I'm not sure what can be done other than requesting the
> benchmark be run with '+S 1', which really kinda defeats the purpose. It
> would be nice to have a different solution, as I agree that this situation
> is pretty embarrassing. This task is such a natural for the actor-model and
> Erlang; it's a shame the performance ends up being so poor.
>
> -k
>