[erlang-questions] DTrace for Erlang
G Bulmer
gbulmer@REDACTED
Tue Nov 13 19:09:23 CET 2007
Is anyone working on building an Erlang DTrace provider?
I realise DTrace isn't available on all Erlang platforms, but now
that DTrace is available on Mac OS X (Leopard), as well as Solaris (and
FreeBSD), I feel it might be worth doing.
About DTrace
-------------------
For those of you unfamiliar with DTrace, it is, basically, magic.
DTrace provides facilities to trace any program, and many aspects of
the OS kernel. It has several critical properties:
1. A program does not need any changes (no need to compile for debug,
or anything like that). All existing programs work (to some extent).
2. When not DTrace'ing, the cost of being DTrace-able is almost zero
(claimed < 0.5%) and this cost is already included (on Solaris & OS X).
3. It is 'secure': it honours the access control mechanisms of the
host OS.
4. DTrace can cross process boundaries, and trace the kernel itself
(if you have the appropriate security privileges).
These features allow DTrace to be used in *PRODUCTION*.
Put another way: it is straightforward to trace through a program in
a user process, back out through the kernel, then into other
processes. Tracing can be activated *after* a program has been
deployed and started *without* changing the program or restarting it.
DTrace overhead when not tracing a program is (claimed to be)
essentially zero, and DTrace costs when activated are (claimed to be)
low.
DTrace is programmable, with a scripting language a bit like awk but
without loops; instead of awk text patterns, DTrace probes are
pattern matched to trigger script actions. Scripting lets you
correlate events across processes and the kernel, and scripts can
process the data so that 'noise' can be filtered out. For example, it
is feasible to trace from an incoming HTTP request through a web
server, through the kernel, to an application server, and back again
and correlate many concurrent flows. You might want to time that
end-to-end flow, or gather intermediate timing, or watch for a particular
access to a specific file, or ..., and the scripting language and
functions are powerful enough to do that.
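To make the "probes are pattern matched to trigger script actions" idea a little more concrete, here is a rough Python model of it. This is only a sketch, not how DTrace is implemented: real scripts are written in D and run inside the kernel. It assumes probe names have the four-field form provider:module:function:name, that an empty field in a description matches anything, and that other fields are shell-style globs. The event list, the `execname` and `fd` values, and the helper names `probe_matches` and `run_script` are all made up for illustration.

```python
from collections import Counter
from fnmatch import fnmatchcase


def probe_matches(description, probe):
    """Return True if a four-field probe description matches a probe name.

    Both arguments are 'provider:module:function:name' strings. An empty
    field in the description matches anything; any other field is treated
    as a shell-style glob. (This sketch requires all four fields.)
    """
    patterns = description.split(":")
    fields = probe.split(":")
    if len(patterns) != 4 or len(fields) != 4:
        raise ValueError("expected provider:module:function:name")
    return all(p == "" or fnmatchcase(f, p) for p, f in zip(patterns, fields))


def run_script(clauses, events):
    """Fire each clause whose description matches and whose predicate holds."""
    for probe, args in events:
        for description, predicate, action in clauses:
            if probe_matches(description, probe) and predicate(args):
                action(probe, args)


# Hypothetical stream of probe firings (invented data, not real trace output).
events = [
    ("syscall::read:entry", {"execname": "beam", "fd": 7}),
    ("syscall::write:entry", {"execname": "beam", "fd": 9}),
    ("syscall::read:entry", {"execname": "httpd", "fd": 3}),
    ("syscall::read:return", {"execname": "beam", "fd": 7}),
]

# One clause: count system call entries, but only for the 'beam' process.
counts = Counter()
clauses = [
    (
        "syscall:::entry",                                      # probe description
        lambda args: args["execname"] == "beam",                # predicate (the filter)
        lambda probe, args: counts.update([probe.split(":")[2]]),  # action
    ),
]

run_script(clauses, events)
print(dict(counts))
```

The predicate is where the 'noise' gets filtered out: the httpd read and the read return never reach the action, so only the two beam entries are counted.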
If you are interested look at http://www.sun.com/bigadmin/content/dtrace/
There is an outline about how to add probes to an application here:
http://docs.sun.com/app/docs/doc/817-6223/chp-usdt
Here is a little bit about providers and naming the probes:
http://www.solarisinternals.com/wiki/index.php/DTrace_Topics_Providers
Here's an article: http://www.devx.com/Java/Article/33943 (try
googling "DTrace Java examples")
Here's more DTrace bloggyness at the 'DTrace Three':
http://blogs.sun.com/ahl/ http://blogs.sun.com/bmc/ http://blogs.sun.com/mws/
I believe there is work on Ruby, Perl and Python DTrace providers too.
Why an Erlang DTrace provider?
---------------------------------------
So I hear you ask, why build a DTrace provider for erl when DTrace
already works (on Solaris and Leopard) with erl?
Well, DTrace will show which functions are called in the erl program
(and which file descriptors and sockets are used, etc.), but will
*not* show you the Erlang program events directly.
Java 6 implements a DTrace provider, which surfaces Java method calls,
class loading/unloading, the garbage collector, concurrency monitors
etc. (rather than the details of the JVM, which are available anyway).
So, I am suggesting an erl DTrace provider would show similar things,
specifically:
- module loading/unloading,
- function calls (and maybe exits),
- garbage collection events,
- Erlang-process state/context-switching,
- message send/receive.
This would likely be sufficient for most purposes when combined with
the existing DTrace support for tracing TCP and UDP sockets, file
access, etc., from the erl process. Mac OS X 'Leopard' comes with a
fancy GUI to show DTrace events, so that would give something like
the new Percept but for many different types of event, not just
process state.
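As a rough illustration of what the event list above might look like as a provider, here is a small Python sketch that prints a provider definition in the style used for user-level (USDT) probes, where a double underscore in the declared identifier becomes a hyphen in the probe name. Every probe name and argument type here is a guess of mine, purely hypothetical; no such Erlang provider exists as far as I know, and a real one would need to be designed by people who know the emulator.

```python
# Hypothetical probes for the events listed above. The names and the
# argument types are illustrative assumptions, not an existing provider.
PROBES = [
    ("module-load", ["char *"]),
    ("module-unload", ["char *"]),
    ("function-entry", ["char *", "char *", "int"]),   # module, function, arity
    ("function-return", ["char *", "char *", "int"]),
    ("gc-start", ["char *"]),                          # Erlang pid as a string
    ("gc-end", ["char *"]),
    ("process-scheduled", ["char *"]),
    ("process-unscheduled", ["char *"]),
    ("message-send", ["char *", "char *"]),            # sender pid, receiver pid
    ("message-receive", ["char *"]),
]


def render_provider(name, probes):
    """Render a USDT-style provider definition as text.

    Hyphens in probe names are written as double underscores, since the
    declaration needs a valid identifier.
    """
    lines = [f"provider {name} {{"]
    for probe, arg_types in probes:
        ident = probe.replace("-", "__")
        lines.append(f"    probe {ident}({', '.join(arg_types)});")
    lines.append("};")
    return "\n".join(lines)


definition = render_provider("erlang", PROBES)
print(definition)
```

A script could then select, say, every send and receive with a description like erlang*:::message-*, assuming the provider were named as in this sketch.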
If I were building a large, complex system, availability of DTrace
would be an important consideration. I believe several companies
claim they moved to Solaris to get DTrace, and having used it for
simple debugging and tuning, I can believe that.
DTrace is so useful that the erl developers and maintainers may find
it valuable for debugging, testing, and tuning Erlang itself.
Could I raise it as an EEP, or is it already on the TODO list?
Garry