<div dir="ltr"><div>I tried running etop to profile the simulation. The commands I used and a snapshot of the etop data are below. The nsime_simulator process is undergoing a lot of reductions. But the message queue is zero. I was expecting to see a lot of messages there because 10,000 event schedule calls each result in a gen_server:call to nsime_simulator. The nsime_simulator contains the event list gb_trees and hence it occupies a lot of memory. So that is not surprising. The plists module is used to run the simultaneous events in parallel. It is occupying the maximum memory. I don't understand how it can occupy more memory than the nsime_simulator itself. I will have to look at the code again to figure this out. I will keep you posted. Thanks for all your suggestions.<br>
<br>Simulation command:<br><span style="font-family:courier new,monospace">erl -sname try -noshell +P 1000000 -pa ebin/ -pa examples/ebin/ -run udp_cs_pairs start 10000 -run init stop</span><br><br></div><div>etop command:<br>
<span style="font-family:courier new,monospace">/usr/lib/erlang/lib/observer-0.9.10/priv/bin/getop -node try@sarva-dev -tracing off</span><br></div><span style="font-family:courier new,monospace"><br></span><div><span style="font-family:courier new,monospace"><br>
<pre>
========================================================================================
 'try@sarva-dev'                                                            10:26:35
 Load:  cpu       177               Memory:  total      784544    binary     105148
        procs  130041                        processes  664962    code         1888
        runq        4                        atom          143    ets           144

Pid            Name or Initial Func    Time     Reds   Memory MsgQ Current Function
----------------------------------------------------------------------------------------
<4696.2.0>     erlang:apply/2           '-'    70800 13799852    0 plists:cluster_runmany
<4696.35.0>    nsime_simulator          '-'  2315977 10471532    0 gen_server:loop/6
<4696.31801.3> erlang:apply/2           '-'    70568   556908    1 gen:do_call/4
<4696.31804.3> erlang:apply/2           '-'    77260   344376    1 io:wait_io_mon_reply
<4696.31800.3> erlang:apply/2           '-'    72686   344360    0 io:wait_io_mon_reply
<4696.31802.3> erlang:apply/2           '-'    75552   344360    0 io:wait_io_mon_reply
<4696.31803.3> erlang:apply/2           '-'    74720   344360    0 io:wait_io_mon_reply
<4696.37.0>    nsime_node_list          '-'        0   185928    0 gen_server:loop/6
<4696.25.0>    code_server              '-'        0   142164    0 code_server:loop/1
<4696.38.0>    nsime_channel_list       '-'        0   142144    0 gen_server:loop/6
========================================================================================
</pre>
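<br><font face="arial,helvetica,sans-serif">To follow up on why the plists coordinator (<4696.2.0> above, which is <0.2.0> on the node itself) holds more memory than nsime_simulator, one simple check is to ask the emulator about that process from a shell attached to the node. The following is only a sketch of that check, using erlang:process_info/2 and erlang:garbage_collect/1; the pid is the one from this particular run and would differ between runs.</font><br>
<pre>
%% In a shell attached to the simulation node:
Pid = list_to_pid("<0.2.0>").
erlang:process_info(Pid, [memory, total_heap_size, heap_size,
                          message_queue_len, current_function]).

%% Forcing a GC shows how much of that memory is live data versus
%% garbage that simply has not been collected yet.
erlang:garbage_collect(Pid).
erlang:process_info(Pid, memory).
</pre>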
<br><font face="arial,helvetica,sans-serif">I also get a bunch of messages in the etop window if I don't set "-tracing off".</font><br><br>Erlang top dropped data 151<br>Erlang top got garbage {trace_ts,<4695.18979.3>,out,<br>
{gen_server,loop,6},<br> {1355,998744,703693}}<br>Erlang top got garbage {trace_ts,<4695.8399.1>,out,<br> {gen,do_call,4},<br>
{1355,998744,703731}}<br></span><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Dec 16, 2012 at 7:46 PM, Jesper Louis Andersen <span dir="ltr"><<a href="mailto:jesper.louis.andersen@erlang-solutions.com" target="_blank">jesper.louis.andersen@erlang-solutions.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>
On Dec 15, 2012, at 9:19 PM, Garrett Smith <<a href="mailto:g@rre.tt">g@rre.tt</a>> wrote:<br>
<br>
> I've lost track of this thread -- are you still guessing, or have you<br>
> spent any time measuring? I looked back and couldn't see any<br>
> discussion about e.g. specific processes with high reduction counts,<br>
> growing queue sizes, etc.<br>
<br>
</div>Sometimes, a well-placed eprof measurement does wonders to tell you what is slow. We once had a system that was spending 40% of its time in a specific regular expression, but this is hard to see unless you profile. I'd say you should look at processes (or groups of them) with high reduction counts and then look into why they are reducing all the time.<br>
<br>
But I have a hunch that this has more to do with the fact that the simulation is serial rather than parallel. If you want to utilize more than one core, then you need to figure out how the problem can be split up.<br>
<span class="HOEnZb"><font color="#888888"><br>
Jesper Louis Andersen<br>
Erlang Solutions Ltd., Copenhagen<br>
<br>
<br>
<br>
</font></span></blockquote></div><br></div>