optimization tricks ?

Tue May 16 14:42:35 CEST 2000

There will be a better eprof module in R7, not an entirely new profiler.

The new eprof module will use the new local call trace facility in R7,
meaning that you don't have to trace-compile the modules to be traced.
Also, the new eprof module is faster (it will slow down your program less
than the old version).
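
For readers who have not used eprof, here is a minimal sketch of a
profiling session. It assumes the eprof API as documented in recent OTP
releases (eprof:start/0, eprof:profile/1, eprof:analyze/0, eprof:stop/0);
the R7 interface may differ in detail. The module name eprof_demo and the
function work/0 are made up for illustration.

```erlang
-module(eprof_demo).
-export([run/0]).

%% Hypothetical workload; stands in for the code you want to measure.
work() ->
    lists:sum(lists:seq(1, 100)).

run() ->
    %% Start the eprof server (ignore the result in case it already runs).
    _ = eprof:start(),
    %% Run the fun under call tracing; no special compilation is needed.
    {ok, Sum} = eprof:profile(fun work/0),
    %% Print per-function call counts and times to the shell.
    eprof:analyze(),
    eprof:stop(),
    {ok, Sum}.
```

Calling eprof_demo:run() from the shell should print the analysis table
and return the result of the workload.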

I've had good experiences with profiling. I have found several totally
unexpected bottle-necks which I would never have guessed otherwise.

In a C++ project some five years ago our program was too slow. Elaborate
re-designs were proposed. When I profiled the program with Quantify,
the bottle-neck turned out to be a short loop where an object was allocated
and then immediately destroyed each time the loop was executed. Moving
the object allocation out of the loop eliminated the entire speed problem.

With the help of the current version of eprof I found a bottle-neck in
the Erlang compiler. It was easy to eliminate the problem as soon as I
found it, but I would probably not have found the bottle-neck without
a profiler. (The problem in the compiler was fixed before R6 was released.)

BTW, there is a "caller" instruction that can be used in trace patterns
to get the caller of the current function. I suppose it could be used
to make call graphs.
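
A rough sketch of the idea, assuming the {caller} match specification
function and the local option of erlang:trace_pattern/3 as documented in
current OTP. The module caller_trace and the functions callee/1 and
call_it/1 are made up for illustration. Note that the call to callee/1
is deliberately not a tail call; for a tail call, the reported caller
would be the function further up the stack.

```erlang
-module(caller_trace).
-export([run/0, callee/1]).

%% Hypothetical function whose callers we want to see.
callee(X) -> X + 1.

%% Hypothetical caller. The tuple wrapping keeps the call from being a
%% tail call, so call_it/1 is still on the stack when callee/1 runs.
call_it(X) ->
    {ok, callee(X)}.

run() ->
    Worker = spawn(fun() ->
                       receive go -> ok end,
                       call_it(1)
                   end),
    %% For any arguments, append the caller's {M,F,A} to the trace message.
    MatchSpec = [{'_', [], [{message, {caller}}]}],
    erlang:trace_pattern({?MODULE, callee, 1}, MatchSpec, [local]),
    %% The calling process becomes the tracer of Worker.
    erlang:trace(Worker, true, [call]),
    Worker ! go,
    Result = receive
                 {trace, Worker, call, {?MODULE, callee, [1]}, Caller} ->
                     Caller
             after 1000 ->
                 timeout
             end,
    %% Remove the trace pattern again.
    erlang:trace_pattern({?MODULE, callee, 1}, false, [local]),
    Result.
```

Collecting the {Caller, Callee} pairs from such messages gives the
edges of a call graph.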

/Bjorn

Scott Lystig Fritchie <scott@REDACTED> writes:

> >>>>> "fc" == Francesco Cesarini <cesarini@REDACTED> writes:
> 
> fc> Erlang methodology:
> 
> Since most worthwhile performance increases come from algorithmic
> changes, I'd have to agree with Francesco's methodology.  Discussion
> on the list about whether to go for "beauty" first is just a minor
> problem of choosing a decent algorithm ... for some value of decent.
> :-)
> 
> Once you've decided that you need more speed, and algorithmic changes
> have been exhausted (or are impractical), it's worth targeting your
> optimization efforts.  The Erlang profiling tools are weak, but they
> can help prevent you from optimizing code that has a minimal effect.
> Several times, profiling has saved me from doing work: the code that I
> thought was a performance problem really wasn't.  Being a lazy
> performance optimizer is almost always a good thing.  Look for the low
> hanging fruit first.
> 
> The "eprof" module in the Erlang distribution was giving me headaches
> (as discussed in this forum a few months back).  Ulf Wiger, I believe,
> told me to do what his team ended up doing when facing the same
> problem: write my own profiler.  I chose instead to hack "eprof".  The
> result isn't pretty, but it does a much better job than "eprof" does.
> 
> It still isn't as accurate as it should be, but it doesn't seem to be
> my fault.  The trace messages generated by the BEAM VM don't always
> cooperate.  I've seen cases where a gen_server callback looking
> something like (pulling from memory):
> 
> 	handle_call({foo, Bar, Baz}, From, State) ->
> 	    Reply = call_real_foo_func(Bar, Baz, State),
> 	    {reply, Reply, State}.
> 
> 	call_real_foo_func(A, B, S) when tuple(A) ->
> 	    do_foo(A, B, S);
> 	call_real_foo_func(A, B, S) when atom(A) ->
> 	    do_some_other_foo(A, B, S).
> 
> ... would sometimes only tally one call to call_real_foo_func() when I
> *know* it's being called hundreds of times via handle_call().  {shrug}
> Given the rumors that R7 is going to have a new and improved profiler,
> I haven't worried much about it because it's infrequent.  (If anyone
> finds a bug in the code below, feel free to fix it and lemme know.  :-)
> 
> At any rate, the eprof-derived profiler code, dubbed slf_eprof.erl, is
> included below.
> 
> -Scott
> 
> P.S.  Oh, one other thing.  We learned a fair amount about our
> application by enabling profiling when compiling the BEAM interpreter.
> It's kind of a 10,000 meter overview of how things work, but it was
> worth doing.  Add "-pg" to the "TYPE_FLAGS" line in
> erts/emulator/Makefile.in, then a "./configure" at the top, then
> compile, run your app, then analyze with "gprof".
> 
> P.P.S.  Eprof and slf_eprof are both really annoying because they
> don't maintain call graphs.  (Difficult, perhaps impossible, with the
> current tracing messages.)  Hopefully the new profiler will be able to
> track a function's time and the time of its descendants.

-- 
Björn Gustavsson            Ericsson Utvecklings AB
bjorn@REDACTED      ÄT2/UAB/F/P
			    BOX 1505
+46 8 727 56 87 	    125 25 Älvsjö