<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">[sigh... hit Reply instead of Reply All]<br><div><br><div>Begin forwarded message:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font face="Helvetica" size="3" color="#000000" style="font: 12.0px Helvetica; color: #000000"><b>From: </b></font><font face="Helvetica" size="3" style="font: 12.0px Helvetica">Kevin Scaldeferri <<a href="mailto:kevin@scaldeferri.com">kevin@scaldeferri.com</a>></font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font face="Helvetica" size="3" color="#000000" style="font: 12.0px Helvetica; color: #000000"><b>Date: </b></font><font face="Helvetica" size="3" style="font: 12.0px Helvetica">October 10, 2008 6:31:27 PM PDT</font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font face="Helvetica" size="3" color="#000000" style="font: 12.0px Helvetica; color: #000000"><b>To: </b></font><font face="Helvetica" size="3" style="font: 12.0px Helvetica">"Edwin Fine" <<a href="mailto:erlang-questions_efine@usa.net">erlang-questions_efine@usa.net</a>></font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font face="Helvetica" size="3" color="#000000" style="font: 12.0px Helvetica; color: #000000"><b>Subject: </b></font><font face="Helvetica" size="3" style="font: 12.0px Helvetica"><b>Re: [erlang-questions] Chameneos.rednux micro benchmark</b></font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; min-height: 14px; "><br></div> </div><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Oct 10, 2008, at 4:19 PM, Edwin Fine wrote:</div><br class="Apple-interchange-newline"><blockquote 
type="cite"><div dir="ltr">It was run with HiPE. It's mentioned on the page that has the Erlang code.</div></blockquote><div><br></div><div>I was actually asking about when you ran it. I did know that the benchmark site uses HiPE.</div><div><br></div><blockquote type="cite"><div dir="ltr"><br>A run of vmstat showed a minimum of about <b>15,000 context switches per second</b> and often more. Without the program running, there were <b>only about 500 or so per second</b>.<br>...<br><br>I ran again with vmstat and -<b>smp disabled</b>. vmstat showed <b>no noticeable difference</b> in the cs column <b>when the program was running compared to when it was not</b>:<br> <br><font class="Apple-style-span" face="'courier new'" size="2"><span class="Apple-style-span" style="font-size: 10px;">...</span></font><br><br>Tentative conclusion: this benchmark makes SMP Erlang do an excessive number of context switches. Is that because it is jumping between cores, or because of inter-process communication between cores? I can't answer that fully, but I can see what happens if we restrict it to one core using one VM.</div></blockquote><div><br></div><div>This is not all that surprising. Consider the part of the benchmark where there are 3 chameneos participating. Each of them, and the parent, will likely end up on their own scheduler (on quad-core). They all send a message then go to sleep. The parent receives the messages, processes some, goes to sleep. Children wake up, get messages, send message, go to sleep. Repeat. 
You can see that for much of the time, many of the schedulers have nothing to do, and their threads may be switched out.</div><div><br></div><div>Without SMP, all the Erlang processes run in the same scheduler thread, and there is always work to be done, so there are few or no context switches.</div><div><br></div><div>Of course, for the portion with 10 chameneos, there is more often work that can be done, but maybe still not enough to saturate all the cores all the time.</div><div><br></div><div><blockquote type="cite"><br><br>So if it's not CPU-bound (60% idle), and it's not memory capacity bound (virtual memory usage only 78MB), and it's not disk or network I/O bound, what is it?<br></blockquote><br></div><div>a) as explained above, there are synchronization requirements as part of the game that may make it difficult to saturate all the CPUs</div><div><br></div><div>b) I also speculated that migrating processes from one thread (core) to another may be significant. I'm not really sure where to look in the OS stats to find evidence to support this. (I guess you'd want to see if the memory bus is saturated.)</div><div><br></div><div><br></div><div><br></div><div>I should also point out that it seems like there is a significant difference either between Erlang running on 2 and 4 cores, or between the chip architectures themselves. Running parallel versions of other benchmarks on my 2-core hardware, I usually find that the total CPU time used is only slightly higher than a single-process version. However, on the Alioth 4-core hardware, the total CPU usage is about double. (Look at the two Erlang versions for binary-trees and mandelbrot.) I am inclined to think Erlang is to blame, if only because the Haskell entries don't show the same behavior.</div><div><br></div><div><br></div><div><br></div><div>-kevin</div><div><br></div><div><br></div></div></div></blockquote></div><br></body></html>