[erlang-questions] gen_server bottleneck

Thu Dec 20 11:34:25 CET 2012

I ran etop to profile the simulation. The commands I used and a snapshot
of the etop output are below. The nsime_simulator process is executing a
lot of reductions, but its message queue length is zero. I was expecting
to see a lot of messages there, because each of the 10,000 event schedule
calls results in a gen_server:call to nsime_simulator. The nsime_simulator
process holds the event list in a gb_trees structure and hence occupies a
lot of memory, so that is not surprising. The plists module is used to run
the simultaneous events in parallel, and the process running it uses the
most memory of all. I don't understand how it can occupy more memory than
nsime_simulator itself. I will have to look at the code again to figure
this out. I will keep you posted. Thanks for all your suggestions.
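
For cross-checking the etop numbers from inside the node, something like
the following could be used. It is only a minimal sketch: the module name
proc_snapshot and its two functions are made up for illustration and are
not part of nsime. It relies only on erlang:processes/0 and
erlang:process_info/2. Note that process_info reports memory in bytes and
reductions as a running total since the process started.

```erlang
-module(proc_snapshot).
-export([top/2, info/1]).

%% Hypothetical helper: return {Pid, Props} for the N live processes with
%% the largest value of Key (reductions, memory or message_queue_len).
top(N, Key) when is_integer(N), N >= 0 ->
    Infos = [{Pid, Props} || Pid <- erlang:processes(),
                             Props <- [info(Pid)],
                             %% skip processes that exited in the meantime
                             Props =/= undefined],
    Sorted = lists:sort(
               fun({_, A}, {_, B}) ->
                       proplists:get_value(Key, A) >=
                           proplists:get_value(Key, B)
               end, Infos),
    lists:sublist(Sorted, N).

%% Returns a property list, or undefined if the process is no longer alive.
info(Pid) when is_pid(Pid) ->
    erlang:process_info(Pid, [registered_name, reductions, memory,
                              message_queue_len, current_function]).
```

Assuming nsime_simulator is registered under that name and is running,
proc_snapshot:info(whereis(nsime_simulator)) gives its counters, and
proc_snapshot:top(10, memory) lists the ten largest processes.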

Simulation command:
erl -sname try -noshell +P 1000000 -pa ebin/ -pa examples/ebin/ -run udp_cs_pairs start 10000 -run init stop

etop command:
/usr/lib/erlang/lib/observer-0.9.10/priv/bin/getop -node try@REDACTED -tracing off

========================================================================================
 'try@REDACTED'                                                            10:26:35
 Load:  cpu       177               Memory:  total      784544    binary     105148
        procs  130041                        processes  664962    code         1888
        runq        4                        atom          143    ets           144

Pid            Name or Initial Func    Time    Reds  Memory    MsgQ Current Function
----------------------------------------------------------------------------------------
<4696.2.0>     erlang:apply/2           '-'   70800 13799852       0 plists:cluster_runmany
<4696.35.0>    nsime_simulator          '-' 2315977 10471532       0 gen_server:loop/6
<4696.31801.3> erlang:apply/2           '-'   70568   556908       1 gen:do_call/4
<4696.31804.3> erlang:apply/2           '-'   77260   344376       1 io:wait_io_mon_reply
<4696.31800.3> erlang:apply/2           '-'   72686   344360       0 io:wait_io_mon_reply
<4696.31802.3> erlang:apply/2           '-'   75552   344360       0 io:wait_io_mon_reply
<4696.31803.3> erlang:apply/2           '-'   74720   344360       0 io:wait_io_mon_reply
<4696.37.0>    nsime_node_list          '-'       0   185928       0 gen_server:loop/6
<4696.25.0>    code_server              '-'       0   142164       0 code_server:loop/1
<4696.38.0>    nsime_channel_list       '-'       0   142144       0 gen_server:loop/6
========================================================================================

I also get a bunch of messages in the etop window if I don't set "-tracing
off".

Erlang top dropped data 151
Erlang top got garbage {trace_ts,<4695.18979.3>,out,
                                 {gen_server,loop,6},
                                 {1355,998744,703693}}
Erlang top got garbage {trace_ts,<4695.8399.1>,out,
                                 {gen,do_call,4},
                                 {1355,998744,703731}}

On Sun, Dec 16, 2012 at 7:46 PM, Jesper Louis Andersen <
jesper.louis.andersen@REDACTED> wrote:

>
> On Dec 15, 2012, at 9:19 PM, Garrett Smith <g@REDACTED> wrote:
>
> > I've lost track of this thread -- are you still guessing, or have you
> > spent any time measuring? I looked back and couldn't see any
> > discussion about e.g. specific processes with high reduction counts,
> > growing queue sizes, etc.
>
> Sometimes, a well-placed eprof measurement does wonders to tell you what
> is slow. We had a system that was spending 40% of its time in a specific
> regular expression once, but this is hard to see unless you profile. I'd
> say you should look at processes with high reduction counts, or groups of
> those and then look into why they are reducing all the time.
>
> But I have a hunch that this has more to do with the fact that the
> simulation is serial more than parallel. If you want to utilize more than
> one core then, you need to figure out how the problem can be split up.
>
> Jesper Louis Andersen
>   Erlang Solutions Ltd., Copenhagen
>
>
>
>
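
As a rough illustration of the eprof suggestion quoted above, a profiling
run could be wrapped along these lines. This is a sketch under
assumptions, not a recipe taken from the thread: the module name
eprof_sketch and the function run/2 are hypothetical. Rootset is a list
of pids or registered names of already running processes to include in
the profile, and Fun is whatever starts the workload.

```erlang
-module(eprof_sketch).
-export([run/2]).

%% Hypothetical wrapper: profile Fun, plus the processes in Rootset,
%% and print a per-function summary of where the time went.
run(Rootset, Fun) when is_list(Rootset), is_function(Fun, 0) ->
    %% start/0 reports an error if eprof is already running; ignore that
    _ = eprof:start(),
    %% runs Fun in a new process and traces it together with Rootset
    Result = eprof:profile(Rootset, Fun),
    %% print totals per function instead of a listing per process
    eprof:analyze(total),
    eprof:stop(),
    Result.
```

For the simulation in this thread the call might look like
eprof_sketch:run([nsime_simulator], fun() -> udp_cs_pairs:start(...) end),
with the arguments to start left as they are in the real code. Profiling
adds tracing overhead, so a smaller event count than 10000 is probably a
better first try.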