[erlang-questions] Fwd: Chameneos.rednux micro benchmark
Kevin Scaldeferri
kevin@REDACTED
Sat Oct 11 18:59:27 CEST 2008
[sigh... hit Reply instead of Reply All]
Begin forwarded message:
> From: Kevin Scaldeferri <kevin@REDACTED>
> Date: October 10, 2008 6:31:27 PM PDT
> To: "Edwin Fine" <erlang-questions_efine@REDACTED>
> Subject: Re: [erlang-questions] Chameneos.rednux micro benchmark
>
>
> On Oct 10, 2008, at 4:19 PM, Edwin Fine wrote:
>
>> It was run with HiPE. It's mentioned on the page that has the
>> Erlang code.
>
> I was actually asking about when you ran it. I did know that the
> benchmark site uses HiPE.
>
>>
>> A run of vmstat showed a minimum of about 15,000 context switches
>> per second and often more. Without the program running, there were
>> only about 500 or so per second.
>> ...
>>
>> I ran again with vmstat and -smp disabled. vmstat showed no
>> noticeable difference in the cs column when the program was running
>> compared to when it was not:
>>
>> ...
>>
>> Tentative conclusion: this benchmark makes SMP Erlang do an
>> excessive number of context switches. Is that because it is jumping
>> between cores, or because of inter-process communication between
>> cores? I can't answer that fully, but I can see what happens if we
>> restrict it to one core using one VM.
>
> This is not all that surprising. Consider the part of the benchmark
> where there are 3 chameneos participating. Each of them, and the
> parent, will likely end up on their own scheduler (on quad-core).
> They all send a message then go to sleep. The parent receives the
> messages, processes some, goes to sleep. Children wake up, get
> messages, send message, go to sleep. Repeat. You can see that for
> much of the time, many of the schedulers have nothing to do, and
> their threads may be switched out.
>
> Without SMP, all the Erlang processes run in the same scheduler
> thread, and there is always work to be done, so no or few context
> switches.
>
> Of course, for the portion with 10 chameneos, there is more often
> work that can be done, but maybe still not enough to saturate all
> the cores all the time.
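
The send-then-sleep pattern described above can be sketched roughly as follows. This is only an illustration, not the benchmark's actual code: the module name pingpong_sketch, the function run/2, and the message shapes {meet, Pid}, go and {done, Pid} are all made up for the example. Each child sends one message and blocks until the parent answers, so with only a few children most schedulers are likely to be idle most of the time.

```erlang
-module(pingpong_sketch).
-export([run/2]).

%% Hypothetical sketch (not the benchmark code): a parent process and
%% NumChildren child processes exchange Rounds request/reply pairs each.
run(NumChildren, Rounds) ->
    Parent = self(),
    Pids = [spawn_link(fun() -> child(Parent, Rounds) end)
            || _ <- lists:seq(1, NumChildren)],
    %% The parent answers every request, one at a time.
    parent_loop(NumChildren * Rounds),
    %% Wait until every child has reported that it is finished.
    [receive {done, Pid} -> ok end || Pid <- Pids],
    ok.

%% A child sends a request, then blocks (sleeps) until the parent replies.
child(Parent, 0) ->
    Parent ! {done, self()};
child(Parent, N) ->
    Parent ! {meet, self()},
    receive
        go -> child(Parent, N - 1)
    end.

%% The parent blocks whenever its mailbox holds no request.
parent_loop(0) ->
    ok;
parent_loop(Remaining) ->
    receive
        {meet, Pid} ->
            Pid ! go,
            parent_loop(Remaining - 1)
    end.
```

Calling pingpong_sketch:run(3, 1000000) with and without SMP while watching the cs column of vmstat should give a rough idea of whether the pattern alone accounts for the context switches; the counts here are arbitrary.
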
>
>>
>>
>> So if it's not CPU-bound (60% idle), and it's not memory capacity
>> bound (virtual memory usage only 78MB), and it's not disk or
>> network I/O bound, what is it?
>
> a) as explained above, there are synchronization requirements as
> part of the game that may make it difficult to saturate all the CPUs
>
> b) I also speculated that migrating processes from one thread (core)
> to another may be significant. I'm not really sure where to look in
> the OS stats to find evidence to support this. (I guess you'd want
> to see if the memory bus is saturated.)
>
>
>
> I should also point out that it seems like there is either a
> significant difference between Erlang running on 2 and 4 cores, or
> between the chip architectures themselves. Running parallel
> versions of other benchmarks on my 2-core hardware, I usually find
> that the total CPU time used is only slightly higher than a single-
> process version. However, on the Alioth 4-core hardware, the total
> CPU usage is about double. (Look at the two Erlang versions for
> binary-trees and mandelbrot.) I am inclined to think Erlang is to
> blame, if only because the Haskell entries don't show the same
> behavior.
>
>
>
> -kevin
>
>