[erlang-questions] Erlang-DTrace progress report: Erlang-DTrace BIFs

Sat Mar 8 21:39:01 CET 2008

Sorry for the slow progress. I've had a little time to work on
Erlang-DTrace recently.

I am posting this to both the Erlang and DTrace groups, so I apologise
if some things are obvious to you.

(For Erlang newbies: Erlang has massive advantages over traditional
application technology because the Erlang VM can load new code without
stopping or quiescing the applications executing on it.)

(For DTrace newbies: DTrace can observe production systems on demand,
and in 'real time' from process memory, rather than via a file.)

There are many pieces to Erlang-DTrace. This piece is about explicit  
support of DTrace from Erlang; Erlang-DTrace BIFs.

The concept is to enable Erlang applications to explicitly choose to  
expose data to DTrace by providing hooks in Erlang to use DTrace  
probes. The hooks are provided as new Erlang Built-in-Functions  
(BIFs). When the DTrace probes are 'off' there is very little  
overhead. The Erlang program must explicitly use these BIFs. This  
isn't the only part of Erlang-DTrace, but is quite useful.

DTrace probes can be enabled, used, and disabled without interacting
with the Erlang VMs. DTrace can monitor many VMs, or just one.
Further, DTrace can correlate behaviour within the OS kernel and
across other applications (e.g. Apache, MySQL, Firefox, etc.), so
please read with that multi-application+kernel view in mind, rather
than comparing with Erlang's own powerful trace facilities.

Anyway ...

I've added these Erlang BIFs as an experiment:

erlang:is_dtrace_on()  -> bool()
	Returns true if dtrace is observing probe erlang:::dtrace
erlang:is_dtrace_off()  -> bool()
	Returns false if dtrace is observing probe erlang:::dtrace,
	i.e. it is the negation of is_dtrace_on()

erlang:dtrace(Binary) -> binary()
	Triggers the probe erlang:::dtrace, passing the Erlang process ID
	(Pid), and the data and size of Binary.
	erlang:dtrace returns the original Binary as its value, so that it
	can be easily inserted into existing code, e.g.
		binary_to_list(B)
	can become
		binary_to_list(erlang:dtrace(B))
	without changing its meaning.

The signature of the DTrace probe is:
         erlang:::dtrace(int Pid, char* binary_data, int binary_data_size)

These BIFs can be called like any normal erlang function, from normal  
Erlang code.
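
To illustrate the pass-through property, here is a minimal sketch. The
module name and probe/1 are hypothetical helpers, not part of the patch.
It assumes only that a VM without the experimental BIF raises undef when
erlang:dtrace/1 is called, in which case the binary is returned
unchanged, so the same code runs on a patched and on a stock VM.

```erlang
-module(dtrace_passthrough).
-export([probe/1, demo/0]).

%% probe/1 is a hypothetical wrapper, not part of the patch.
%% On a VM with the experimental erlang:dtrace/1 BIF it fires the probe
%% and returns the BIF's result (the original binary). On a VM without
%% the BIF the call raises undef, and the binary is returned unchanged.
probe(Bin) when is_binary(Bin) ->
    try
        apply(erlang, dtrace, [Bin])
    catch
        error:undef -> Bin
    end.

%% Wrapping a sub-expression in probe/1 must not change the result.
demo() ->
    B = <<"<?xml wibble>">>,
    binary_to_list(probe(B)) =:= binary_to_list(B).
```

Calling dtrace_passthrough:demo() should return true either way; the
only difference on a patched VM is that the probe fires as a side effect.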

----- Example ----
( Here is a little dtrace script (with some macros to encapsulate  
extracting the fields of an Erlang process id):
#define PID_NODE(ePid) (((ePid) >> 19) & 0x1fff)
#define PID_LPID(ePid) (((ePid) >> 4) & 0x7ffff)
#define PID_SER(ePid)  (((ePid) >> 2) & 0x03)

erlang*:::dtrace
{
     printf("erlang* dtrace:");
     printf(" pid=<%d.%d.%d> (arg0=0x%x)",
         PID_NODE(arg0), PID_LPID(arg0), PID_SER(arg0), arg0);

     s = stringof(copyin(arg1, arg2));
     printf(" arg1='%s' (size:%d)\n", s, arg2);
}
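
As a cross-check of the macros, the same shifts and masks can be written
in Erlang. This is only a sketch: the module name and decode/1 are made
up for illustration, and the bit layout is copied from the macros above
rather than taken from the emulator source.

```erlang
-module(pid_fields).
-export([decode/1]).

%% Applies the same shifts and masks as the PID_NODE, PID_LPID and
%% PID_SER macros to the raw integer that DTrace sees as arg0.
decode(Arg0) when is_integer(Arg0), Arg0 >= 0 ->
    Node = (Arg0 bsr 19) band 16#1fff,
    Lpid = (Arg0 bsr 4)  band 16#7ffff,
    Ser  = (Arg0 bsr 2)  band 16#03,
    {Node, Lpid, Ser}.
```

For the arg0 value 0x1f3 that appears in the DTrace output further down,
pid_fields:decode(16#1f3) gives {0,31,0}, matching the printed
pid=<0.31.0>.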

( Here is an Erlang session. In this case the dtrace script was started
before Erlang and observes every Erlang node (VM) started afterwards,
but it works just as well when the Erlang nodes (VMs) are already
running and the dtrace script is started only when required:

gb$ bin/erl
Erlang (BEAM) emulator version 5.6 [source] [smp:2] [async-threads:0] [kernel-poll:false]

Eshell V5.6  (abort with ^G)
1> erlang:is_dtrace_on().
true
2> B = <<"<?xml wibble>">>.
<<"<?xml wibble>">>
3> erlang:dtrace(B).
<<"<?xml wibble>">>
4> erlang:dtrace(<<"HTTP 1.0">>).
<<"HTTP 1.0">>
5> erlang:dtrace(erlang:dtrace(B)).
<<"<?xml wibble>">>
6>

( Here's the output of the DTrace session (with blank lines removed):
gb$ ./tests.sh
dtrace: script 'test1.d' matched 0 probes
CPU     ID                    FUNCTION:NAME
   0  21486                  dtrace_1:dtrace erlang* dtrace: pid=<0.31.0> (arg0=0x1f3) arg1='<?xml wibble>' (size:13)
   0  21486                  dtrace_1:dtrace erlang* dtrace: pid=<0.31.0> (arg0=0x1f3) arg1='HTTP 1.0' (size:8)
   0  21486                  dtrace_1:dtrace erlang* dtrace: pid=<0.31.0> (arg0=0x1f3) arg1='<?xml wibble>' (size:13)
   0  21486                  dtrace_1:dtrace erlang* dtrace: pid=<0.31.0> (arg0=0x1f3) arg1='<?xml wibble>' (size:13)

( tests.sh is a little wrapper to get the options correct:
sudo dtrace -C -Z -s test1.d

%%----
So, if an Erlang application calls the erlang:dtrace/1 BIF, the binary
data is exposed to DTrace if the probe has been enabled by dtrace.
dtrace can be run at any time, and receives data from the probe *only*
while dtrace is running and the probe is enabled.

When the dtrace probes are disabled (not active), there is little
overhead (a single BIF call).
The dtrace script can do whatever it likes with the data, including
correlating across multiple Erlang VMs (nodes) and other OS processes.
Currently, the parameter to erlang:dtrace/1 must be an Erlang Binary,
which suits my use case; sockets send and receive Binary data (a byte
sequence).
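
Since the probe only accepts a Binary, other terms have to be converted
first, and that conversion costs something even when nobody is
observing. One way to avoid it is to check is_dtrace_on() before
building the binary. The sketch below assumes exactly that;
dtrace_term, dtrace_on/0 and trace_term/1 are hypothetical names, and on
a VM without the experimental BIFs the check simply reports false.

```erlang
-module(dtrace_term).
-export([trace_term/1]).

%% dtrace_on/0 is a hypothetical helper: it asks the experimental
%% erlang:is_dtrace_on/0 BIF, and treats a VM without that BIF
%% (the call raises undef) as "probe not enabled".
dtrace_on() ->
    try
        apply(erlang, is_dtrace_on, [])
    catch
        error:undef -> false
    end.

%% Only pay for term_to_binary/1 when dtrace is actually observing.
trace_term(Term) ->
    case dtrace_on() of
        true ->
            apply(erlang, dtrace, [term_to_binary(Term)]),
            ok;
        false ->
            ok
    end.
```

The DTrace side then sees the external term format of Term, so this is
most useful for size or rate measurements rather than printing arg1 as a
string.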

This is pre-pre*-alpha-quality.
It's not integrated into the Erlang build system; I have to run
'dtrace -h -s dtrace_probes.d' by hand.
The code currently lives in existing Erlang source files, and should be
factored out.
Ideally, is_dtrace_on() and is_dtrace_off() should be allowed in guard
expressions, as part of function/case pattern matching, to make them
more convenient and useful.

G Bulmer