The mysteries of eprof
Thu Feb 24 12:10:55 CET 2000
Thanks for your comments and observations.
There will be a better eprof for R7 (the next major OTP release).
How much better we don't know yet since it is not finished, but at
least the eprof process itself will have less overhead than in the
current version (i.e. your own code will run *somewhat* slower when
profiling, not *much* slower as with the current version of eprof).
Maybe I will mail more information when the new version of eprof
is finished. (BTW, the new version of eprof will not run on R6, because
it will be dependent on new features in R7.)
Scott Lystig Fritchie <scott@REDACTED> writes:
> Good morning. I hope this message is coherent: it's been a long
> I've been fiddling a lot with eprof, trying to plumb its secrets.
> It's difficult to get used to, because it definitely isn't "gprof".
> Dealing strictly with wall-clock times is, well, difficult. I suppose
> the ideal tool would be able to count the CPU time consumed by the
> reductions made by a function *and* the CPU time consumed by BIFs
> *and* somehow account for CPU time consumed by scheduling, driver
> overhead, etc. Oh, and while I'm dreaming, we might as well sum all
> the time consumed by function F plus all the functions F calls. :-)
> Anyhow, there have been a few things I've discovered this week that
> don't make a lot of sense. One is actually eprof-specific. The other
> is more scheduling-oriented.
> The eprof question is: if I'm profiling an already-running process,
> and I let it go for N seconds, why doesn't total_analyse() report N
> seconds (or "close to N") worth of activity?
> I've been able to answer part of the question: if the process was
> asleep waiting for a message, the time from when eprof:profile() was
> called and when the message arrives.
> If that process never receives a message, then eprof thinks it did
> nothing. It *did* do nothing, but it would be less misleading if
> eprof could say it spent 100% of its time within the blocking
> function. Giving my understanding of how eprof & process tracing
> works (wall-clock timestamped messages are sent to the eprof process
> whenever a function is entered, exited, or calls/returns from another
> function), eprof's silence makes sense.
> When I point eprof at processes that are fairly busy,
> eprof:total_analyse() sums almost the entire amount of wall-clock time
> spent during the profiling. (This and all other examples use 20
> seconds between eprof:profile() and eprof:stop_profiling().) For
> FUNCTION CALLS TIME SECONDS
> gen_server:loop_wait/1 489 98. % 18.2782
> callback_server_hack:get_key/2 735 0.3 % 0.0586
> callback_server_hack:handle_call/3 489 0.3 % 0.0571
> gen_server:loop/6 489 0.1 % 0.0340
> gen_server:reply/2 489 0.1 % 0.0253
> gen_server:handle_msg/6 489 0.1 % 0.0188
> (These reports have been hacked to show the amount of wall-clock time
> in the rightmost column. If I remember, I'll include a patch below.
> The relative time percentage formatting is ugly, but it's good enough
> for me....)
> That's pretty close to 20 seconds. It'd be nice to know how eprof
> missed the other ~1.5 seconds, but I'm not that curious...
> ... unless it's the same reason that causes this:
> FUNCTION CALLS TIME SECONDS
> gen_tcp:call/2 928 58. % 3.6845
> gen:wait_resp_mon/3 1009 15. % 0.9692
> fnstuff:svc_tcp_main/5 906 8.0 % 0.5042
> It's still a 20 second sample. If I included all the report, the
> wall-clock time only adds up to 6.6 seconds. Since I'm reallyreally
> interested in profiling what this process is doing, it would be
> wonderful to know where the phantom time is.
> That second example leads me to my other question: why would
> gen_tcp:call/2 suck up so much time? (In my application,
> gen_udp:call/2 does the same thing.) To be honest, I believe I've
> answered that question, but the answer doesn't help (much) to fix the
> underlying problem.
> gen_tcp:call/2 looks like a classic example of what make_ref() is used
> for: uniquely tagging messages so that you can grab only the response
> to that packet. Process-wise, things look like:
> my process <--> inet_tcp:socket_loop/3 <--> Port handling the <-> The
> and friends socket Network
> As far as I can tell, the reason gen_tcp:call/2 takes so much time is
> that my system is actually busy enough that the message sent from my
> process -> inet_tcp:socket_loop/3 (via gen_tcp:send()) takes about
> 2.3ms (per call!) to be "receive"d by the recipient process. Almost
> immediately, the recipient sends back an acknowledgement, but it takes
> that long again before the ack is "receive"d by my process.
> Ideally, that data should be *on*the*wire* as soon as possible. I'd
> love to get rid of most of that 2.3ms latency, but I'm at a loss how.
> Well, short of playing games with the BEAM process scheduler.(**) Or
> reinventing the gen_tcp and gen_udp wheels by having my process talk
> directly to the port. :-(
> Any ideas from enlightened subscribers?
> (*) Good thing I started it shortly before noon!
> (**) I'm sure my colleages will be delighted to know I've been
> contemplating such things ... like a "!!" or some other hack to put
> the sending process to sleep ASAP and wake up the recipient ASAP.
> Scott Lystig Fritchie, <scott@REDACTED>
> 5401 - 10th Ave S, Minneapolis, MN 55417 USA
> office: 612.827.2835, cell: 612.805.1383
> Professional Governing: Is It Faked?
Björn Gustavsson Ericsson Utvecklings AB
125 25 Älvsjö
More information about the erlang-questions