[erlang-questions] DTrace for Erlang

Thu Nov 15 11:51:01 CET 2007

Henry

> You should probably have a look at the trace
> functionality built into Erlang (the trace BIF with
> the dbg library). That, extended with a few bits and
> bobs, would probably provide you with most of what you need.
Thanks for the reply.
Would you please help me? What are the "few bits and bobs"?

I can see that the facilities of erlang:trace work within Erlang
(though I have some questions to come), but I have been unable to
understand how I get an end-to-end flow (e.g. a user-oriented use
case) which passes through non-Erlang web servers, through Erlang
and my Erlang-based services, and into non-Erlang services like
databases via the OS kernel.

Specifically, how would I trace between Erlang applications,
non-Erlang applications, and the kernel using those "few bits and
bobs" while keeping the impact on the production system low? (I am
assuming I will need to filter down the information so that I don't
impact the production system too much.)
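
For reference, here is a minimal sketch of the kind of filtered,
Erlang-only tracing I understand the dbg suggestion to mean. The
module name trace_sketch and the choice of lists:seq/2 as the traced
function are only illustrative assumptions; this covers the Erlang
side and says nothing about the web server, database or kernel parts
of the flow.

```erlang
-module(trace_sketch).
-export([run/0]).

%% Illustrative only: trace calls to one function, in one process,
%% to keep the amount of trace data (and the overhead) small.
run() ->
    %% Start the default dbg tracer, which prints trace messages.
    {ok, _Tracer} = dbg:tracer(),
    %% Trace function calls (with timestamps) in this process only,
    %% rather than in every process on the node.
    {ok, _} = dbg:p(self(), [call, timestamp]),
    %% Only calls to lists:seq/2 match; also report return values.
    {ok, _} = dbg:tpl(lists, seq, 2, [{'_', [], [{return_trace}]}]),
    _ = lists:seq(1, 3),
    %% Give the tracer a moment to print before shutting it down.
    timer:sleep(100),
    dbg:stop().
```

Restricting both the traced processes and the trace pattern is what
keeps the load down, but the resulting events still stop at the
boundary of the Erlang node.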

Detail
--------
I can understand people here may feel that I am under-constraining
the problem, but I expect projects to encompass a mixed set of
technologies. I expect to have to deal with end-to-end use cases for
*all* parts of a system. I do not expect those use cases to be
replaced with end-to-end Erlang in a single change-over event. At the
Erlang User Conference, several examples were described, like the
Travel Angel, which will implement this mixed-technology pattern.
They are non-trivial, so I think they'd benefit from tool support.

I do expect many of those change-over events to come with some
problems. To increase the chances of success, I would like tools
which will help customers (and me!) deal with those mixed-technology
environments after deployment, in production. I would go a bit
further, and suggest that in many cases, coexistence of Erlang and
legacy technology is critical to making the transition to, and
uptake of, Erlang practical.

DTrace is a tool which can be made to work across those other
technologies. I am willing to invest the effort to learn about
comparable alternatives for a *mixed* technology environment which
includes Erlang, but I don't see how that works right now.

Summary:
I feel strongly that Enterprise-class technologies (which I believe
Erlang is) need to coexist with 'legacy' technologies to increase
the opportunities to introduce the improved (Erlang) technologies at
tolerable levels of risk in a wide range of situations. I think
supporting production tooling (like DTrace) for end-to-end
production analysis, debugging and tuning is a facet of coexistence.

Garry

>> Is anyone working on building an Erlang DTrace provider?
>> I realise DTrace isn't available on all Erlang platforms, but now
>> that DTrace is available on Mac OS X (Leopard), as well as Solaris
>> (and FreeBSD), I feel it might be worth doing.
>> About DTrace
>> -------------------
>> For those of you unfamiliar with DTrace, it is, basically, magic.
>> DTrace provides facilities to trace any program, and many aspects
>> of the OS kernel. It has several critical properties:
>> 1. A program does not need any changes (no need to compile for
>> debug, or anything like that). All existing programs work (to
>> some extent).
>> 2. When not DTrace'ing, the cost of being DTrace-able is almost
>> zero (claimed < 0.5%) and this cost is already included (on
>> Solaris & OS X).
>> 3. It is 'secure': it honours the access control mechanisms of
>> the host OS.
>> 4. DTrace can cross process boundaries, and trace the kernel
>> itself (if you have the appropriate security privileges).
>> These features allow DTrace to be used in *PRODUCTION*.
>> Put another way: it is straightforward to trace through a program
>> in a user process, back out through the kernel, then into other
>> processes. Tracing can be activated *after* a program has been
>> deployed and started *without* changing the program or restarting
>> it. DTrace overhead, when not tracing a program, is (claimed to
>> be) essentially zero, and DTrace costs when activated are
>> (claimed to be) low.
>> DTrace is programmable, with a scripting language a bit like awk
>> but without loops; instead of awk text patterns, DTrace probes
>> are pattern matched to trigger script actions. Scripting lets
>> you correlate events across processes and the kernel, and scripts
>> can process the data so that 'noise' can be filtered out. For
>> example, it is feasible to trace from an incoming HTTP request
>> through a web server, through the kernel, to an application
>> server, and back again, and correlate many concurrent flows. You
>> might want to time that end-to-end flow, or gather intermediate
>> timing, or watch for a particular access to a specific file,
>> or ..., and the scripting language and functions are powerful
>> enough to do that....

>> Why an Erlang DTrace provider?
>> ---------------------------------------
>> So I hear you ask, why build a DTrace provider for erl when
>> DTrace already works (on Solaris and Leopard) with erl?
>> Well, DTrace will show which functions are called in the erl
>> program (and which file descriptors and sockets are used, etc.),
>> but will *not* show you the Erlang program events directly.
>> ...
>> If I were building a large, complex system, availability of
>> DTrace would be an important consideration. I believe several
>> companies claim they moved to Solaris to get DTrace, and having
>> used it for simple debugging and tuning, I can believe that.
>> DTrace is so useful that the erl developers and maintainers may
>> find it extremely useful for debugging, testing, and tuning
>> Erlang itself.
>> Could I raise it as an EEP, or is it already on the TODO list?