[erlang-questions] Heads-up: The cost of get_stacktrace()

Tue Nov 5 21:36:15 CET 2013

(Executive summary: exceptions cheap, but erlang:get_stacktrace() kind 
of expensive; also, avoid 'catch Expr'.)

We have wrestled for some time with some very strange unresponsiveness 
and high amounts of garbage collection, and finally managed to track 
down the problem. It was in a piece of code that matches some input data 
against a number of different "patterns", trying one possibility at a 
time in a failure-driven loop. The actual problem was a wrapper that 
executed each call in a try/catch, with a default catch clause looking 
something like this:

   try match(Pattern, Data)
   catch
     ...
     Class:Term ->
       {foo_error, {Class, Term, erlang:get_stacktrace()}}
   end

The stacktrace was usually discarded again further up and the whole 
thing was retried with another "pattern". This code got executed tens of 
thousands of times per second. When we removed the call to 
get_stacktrace(), the system instantly started to behave much better.

The purpose of this mail, then, is both to warn about sloppy use of 
get_stacktrace() and to clarify how stack traces are handled and wherein 
the costs lie.

First of all, triggering an exception is quite cheap. The necessary 
stack trace information (by default, 8 pointers) is quickly saved in an 
opaque blob, and control gets passed up to the nearest catch handler (if 
there is one). If there is a handler, normal Erlang execution will 
resume, trying to match the catch-clauses. If no catch clause matches, 
the exception state is unchanged and we look for the next catch handler, 
until either some catch clause matches or the top of the call stack is 
reached (which will terminate the process).

If a catch clause matches, execution just continues and no extra cost is 
incurred as long as you don't try to inspect the stack trace. If none of 
the clauses match, the only cost was that of trying the clause patterns 
and guards. For example, terminating a process by "exit(normal)" has 
very little overhead even if it passes through a number of catch 
handlers that just pass it on upwards, because even the process exit 
signal will not contain the stack trace. And using throw/catch for 
nonlocal return out of a deep recursion is very cheap.
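
To illustrate the nonlocal-return case, here is a minimal sketch. The module name, find_first/2, and the {found, X} tag are made up for this example. The throw unwinds the whole recursion in one step, and since the handler never asks for the stack trace, nothing gets reified.

```erlang
-module(nonlocal_demo).
-export([find_first/2]).

%% Return {ok, X} for the first element X of a nested list for which
%% Pred(X) is true, or 'none' if there is no such element.
find_first(Pred, DeepList) ->
    try walk(Pred, DeepList) of
        ok -> none
    catch
        %% Only the tagged throw is handled; the stack trace is never
        %% requested, so it is never built as a term.
        throw:{found, X} -> {ok, X}
    end.

%% Depth-first walk; escapes with a throw on the first hit.
walk(Pred, [H | T]) when is_list(H) ->
    walk(Pred, H),
    walk(Pred, T);
walk(Pred, [H | T]) ->
    case Pred(H) of
        true -> throw({found, H});
        false -> walk(Pred, T)
    end;
walk(_Pred, []) ->
    ok.
```

For example, nonlocal_demo:find_first(fun(X) -> X > 3 end, [1, [2, [3, 4]], 5]) stops at the 4 without visiting the 5.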

*But* if someone wants to actually look at the stack trace of the 
exception, the "opaque blob" mentioned above must be reified as an 
Erlang term, by calling erlang:get_stacktrace(). This amounts to looking 
up the module and function name and arity corresponding to each of the 
saved code pointers, and creating a corresponding list of MFA tuples on 
the heap. (This also happens if the process terminates due to an 
exception of type 'error' or 'throw', to include the stack trace in the 
exit signal.)
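
If you want to see what the reified term looks like, the following sketch is one way to get at it. The module and function names are invented for the example. It relies on the old catch operator, which wraps an 'error' exception as {'EXIT', {Reason, Stacktrace}}, so the sketch doesn't depend on calling get_stacktrace() directly.

```erlang
-module(trace_shape).
-export([reified/0]).

%% Force an 'error' exception and let 'catch' reify the saved trace,
%% so that the resulting list can be inspected as an ordinary term.
reified() ->
    {'EXIT', {boom, Stack}} = (catch fail()),
    Stack.

%% Hypothetical failing function.
fail() ->
    erlang:error(boom).
```

Each element of the returned list describes one saved call site: module, function, arity (or argument list), and from R15 on also a location list with file and line.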

In addition, as of Erlang/OTP R15 this operation is 4-5 times(!) more 
expensive than it used to be pre-R15, because now the stack trace also 
includes file names and line numbers. That's more data to be allocated 
on the heap, but most of the cost is probably in traversing the tables 
that map bytecode regions to corresponding source file regions (these 
tables are created by the compiler and are included in the .beam files). 
For us, this difference meant that we went from "mysteriously high 
activity, but not critical" under R14 to "random bursts of 
unresponsiveness" under R15, and it took us a lot of effort to figure 
out what was going on.

So the general advice is: Don't call erlang:get_stacktrace() just 
because you can. If you don't have a real reason for catching every 
possible exception, just let the uninteresting ones fall through. Avoid 
the temptation to have a catch-all clause like in the example above 
that repackages the exception in a tuple with some tag that you happen 
to like, in particular if there's a chance that the code will be 
retried over and over again. If you don't intend to handle the 
exception, then let it remain an exception for as long as possible and 
don't turn the stack trace into a term, because that's when you pay.
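
As a rough sketch of what this could look like for a retry loop like ours: the names below (first_match/2, the 'nomatch' throw, and the prefix-based match/2) are stand-ins invented for the example, not our actual code. The point is only that the one expected failure is handled and everything else stays an exception.

```erlang
-module(match_loop).
-export([first_match/2]).

%% Try each pattern in turn. Only the expected failure, a thrown
%% 'nomatch', is handled here. Any other exception propagates as is,
%% and no stack trace is turned into a term on the way.
first_match([Pattern | Rest], Data) ->
    try match(Pattern, Data) of
        Result -> {ok, Result}
    catch
        throw:nomatch -> first_match(Rest, Data)
    end;
first_match([], _Data) ->
    nomatch.

%% Hypothetical stand-in for the real matcher: a "pattern" is a prefix
%% that Data (a list) must start with; the result is the remainder.
match(Prefix, Data) ->
    case lists:prefix(Prefix, Data) of
        true -> lists:nthtail(length(Prefix), Data);
        false -> throw(nomatch)
    end.
```

If a real bug in match/2 raises, say, a badarg, it is not swallowed by the loop. Whoever eventually handles it can decide whether the stack trace is worth paying for.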

It's of course still valid to call get_stacktrace() in many situations, 
e.g. when the process is about to give up, when writing crash 
information to a log, or for something that happens only rarely and 
where the stack trace is actually useful. But never call it in a 
library function that might be used heavily in a loop.

Finally, this is also another reason to rewrite old occurrences of 
'catch Expr' into 'try Expr catch ... end', because 'catch Expr' 
basically works like this:

   try Expr
   catch
     throw:Term -> Term;
      exit:Term -> {'EXIT', Term};
     error:Term -> {'EXIT', {Term, erlang:get_stacktrace()}}
   end

So what happens if you use one of the following old idioms?

   ...
   catch foo(...),  % for side effect, ignore the result
   ...

or

   case catch foo(...) of
     {'EXIT', Reason} -> ...;
     Result -> ...
   end

Well, when the exception type is 'error', the catch will build a result 
containing the symbolic stack trace, and this will then in the first 
case be immediately discarded, or in the second case matched on and then 
possibly discarded later. Whereas if you use try/catch, you can ensure 
that no stack trace is constructed at all to begin with.
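
The two idioms could be rewritten along these lines. This is a sketch only: the wrapper names and the {ok, ...}/{error, ...} result shapes are my own choice for the example, and the fun F stands in for the foo(...) call.

```erlang
-module(catch_rewrite).
-export([run_ignoring/1, run_checked/1]).

%% Replaces "catch F()" used only for its side effect: every exception
%% is swallowed, and the stack trace is never asked for.
run_ignoring(F) ->
    try F() of
        _ -> ok
    catch
        _:_ -> ok
    end.

%% Replaces "case catch F() of {'EXIT', Reason} -> ...; Result -> ... end".
run_checked(F) ->
    try F() of
        Result -> {ok, Result}
    catch
        %% 'catch Expr' returns thrown terms as plain values; kept here
        %% only to mirror the old behaviour.
        throw:Term -> {ok, Term};
        exit:Reason -> {error, Reason};
        %% Unlike 'catch Expr', no stack trace is attached to Reason.
        error:Reason -> {error, Reason}
    end.
```

In real code you would normally list only the classes and terms you actually expect instead of mirroring everything that 'catch' does.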

Sorry that this got a bit long, but I think that was all.

     /Richard