[erlang-questions] Erlang tracing
Mon Sep 21 12:02:31 CEST 2015
> On 21 Sep 2015, at 10:52, Lukas Larsson <lukas@REDACTED> wrote:
> To start the discussion, here are a few of my ideas in no particular order:
> * Allow multiple tracers. Today only one port/process can be the receiver of trace data.
> * Create a couple of scalable high throughput tracing output backends with different overflow mechanics. Today all tracing is funneled through one bottleneck and has no overflow handling at all.
> * Expose vm probes (today dtrace probes) to the erlang tracer.
> * Better integration of dtrace/lttngt/systemtap into the erlang trace.
> * Allow the erlang tracer to be an Erlang callback module. Today only ports/processes are allowed.
> * Optimize trace output to file/ip. Maybe use something like the Common Trace Format (http://git.efficios.com/?p=ctf.git;a=blob_plain;f=common-trace-format-specification.md;hb=master <http://git.efficios.com/?p=ctf.git;a=blob_plain;f=common-trace-format-specification.md;hb=master>), instead of the term_to_binary that we have today.
> * Write much much better documentation for dbg :)
+1 on the suggestions, esp. multiple tracers. The one-tracer restriction makes it hard to explore some very interesting monitoring and debugging ideas.
I have had some success using an event/[2,3] function for lightweight debugging, where the functions event(Line, Event) and event(Line Event, State) exist only to be traced. A simple way to generalize this would be to create ‘null’ BIFs: erlang:trace_event(Info, Event) -> ok, erlang:trace_event(Info, Event, State) -> ok (just a trace_event/1 would of course suffice). If not traced, these functions do nothing; when traced, they emit trace info similar to io:format debugging.
See e.g. the locks application:
The ?debug/[1,2] macro is used to also include line numbers. One could perhaps imagine such a macro existing in dbg, but putting an event() function in dbg as well forces a remote call even when tracing is not enabled.
So basically, the features are:
- a ‘debug’ function that is as lightweight as possible when not traced, just existing for the purpose of being traced
- including line numbers
I’ve found this extremely useful not least when enabled during multi-node test suite execution, using ttb to merge logs and a final pretty-printing step to produce a text log best viewed with emacs's Erlang syntax highlighting. See https://github.com/uwiger/locks/blob/master/src/locks_ttb.erl <https://github.com/uwiger/locks/blob/master/src/locks_ttb.erl> (Logs fetched only if test case fails).
One might fantasize about enabling migration from io:format() debugging to this, ideally by simply rewriting the debug macro: a handy wrapper that simulates the original printouts. There are some problems with this, so I’ve chosen not to go there.
To further generalize the issue …
What we ended up with at Ericsson was a variety of techniques for getting the most out of tracing:
- A script that inserted a ?DBG macro after each ->. This was a conditional debug printout, but could have been a trace event as above. The feature was to log the ‘branching’ in the code, e.g. noting which case function or receive clause was actually selected.
- Consistently breaking out case clauses etc as separate functions (since function calls can be traced). See e.g. the diameter application in OTP for examples of this style of coding.
I think there is room for revisiting the ‘et’ concept in this context. The thing about ‘et’ was that it allowed for a ‘model-level’ trace, complete with sequence diagram presentation. Its main drawback was that few figured out how to use it. But I think the idea is brilliant: to be able to dynamically extract ‘model markers’ from executing code and displaying it in a way that can tie back to the conceptual design.
Lastly, TTB has been greatly improved (ahem…) and is extremely useful for multi-node tracing. When improving the DBG documentation, make sure to show how TTB and DBG work together and what their respective roles are.
Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the erlang-questions