[erlang-questions] gen_server bottleneck
Saravanan Vijayakumaran
sarva.v@REDACTED
Thu Dec 20 11:34:25 CET 2012
I tried running etop to profile the simulation. The commands I used and a
snapshot of the etop data are below. The nsime_simulator process is
undergoing a lot of reductions. But the message queue is zero. I was
expecting to see a lot of messages there because 10,000 event schedule
calls each result in a gen_server:call to nsime_simulator. The
nsime_simulator contains the event list gb_trees and hence it occupies a
lot of memory. So that is not surprising. The plists module is used to run
the simultaneous events in parallel. It is occupying the maximum memory. I
don't understand how it can occupy more memory than the nsime_simulator
itself. I will have to look at the code again to figure this out. I will
keep you posted. Thanks for all your suggestions.
Simulation command
erl -sname try -noshell +P 1000000 -pa ebin/ -pa examples/ebin/ -run
udp_cs_pairs start 10000 -run init stop
etop command:
/usr/lib/erlang/lib/observer-0.9.10/priv/bin/getop -node
try@REDACTED off
========================================================================================
'try@REDACTED'
10:26:35
Load: cpu 177 Memory: total 784544
binary 105148
procs 130041 processes 664962
code 1888
runq 4 atom 143
ets 144
Pid Name or Initial Func Time Reds Memory MsgQ Current
Function
----------------------------------------------------------------------------------------
<4696.2.0> erlang:apply/2 '-' 70800 13799852 0
plists:cluster_runmany
<4696.35.0> nsime_simulator '-' 2315977 10471532 0
gen_server:loop/6
<4696.31801.3> erlang:apply/2 '-' 70568 556908 1
gen:do_call/4
<4696.31804.3> erlang:apply/2 '-' 77260 344376 1
io:wait_io_mon_reply
<4696.31800.3> erlang:apply/2 '-' 72686 344360 0
io:wait_io_mon_reply
<4696.31802.3> erlang:apply/2 '-' 75552 344360 0
io:wait_io_mon_reply
<4696.31803.3> erlang:apply/2 '-' 74720 344360 0
io:wait_io_mon_reply
<4696.37.0> nsime_node_list '-' 0 185928 0
gen_server:loop/6
<4696.25.0> code_server '-' 0 142164 0
code_server:loop/1
<4696.38.0> nsime_channel_list '-' 0 142144 0
gen_server:loop/6
========================================================================================
I also get a bunch of messages in the etop window if I don't set "-tracing
off".
Erlang top dropped data 151
Erlang top got garbage {trace_ts,<4695.18979.3>,out,
{gen_server,loop,6},
{1355,998744,703693}}
Erlang top got garbage {trace_ts,<4695.8399.1>,out,
{gen,do_call,4},
{1355,998744,703731}}
On Sun, Dec 16, 2012 at 7:46 PM, Jesper Louis Andersen <
jesper.louis.andersen@REDACTED> wrote:
>
> On Dec 15, 2012, at 9:19 PM, Garrett Smith <g@REDACTED> wrote:
>
> > I've lost track of this thread -- are you still guessing, or have you
> > spent any time measuring? I looked back and couldn't see any
> > discussion about e.g. specific processes with high reduction counts,
> > growing queue sizes, etc.
>
> Sometimes, a well-placed eprof measurement does wonders to tell you what
> is slow. We had a system that was spending 40% of its time in a specific
> regular expression once, but this is hard to see unless you profile. I'd
> say you should look at processes with high reduction counts, or groups of
> those and then look into why they are reducing all the time.
>
> But I have a hunch that this has more to do with the fact that the
> simulation is serial more than parallel. If you want to utilize more than
> one core then, you need to figure out how the problem can be split up.
>
> Jesper Louis Andersen
> Erlang Solutions Ltd., Copenhagen
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20121220/98791c17/attachment.htm>
More information about the erlang-questions
mailing list