[erlang-questions] getting type info at runtime

Mon Sep 26 23:58:00 CEST 2011

Hello!

Regarding types in .beam files: That information is only part of the abstract code, so dialyzer, HiPE and all other tools need the abstract code or the source code to do their magic. The reason it is stripped out is that .beam files should be as small as possible by default, and also partly that you cannot trust a typespec to be correct.
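As a rough illustration of what "part of the abstract code" means in practice, here is a minimal sketch that pulls the -spec attributes out of a .beam file with beam_lib. The module name spec_info and the function specs/1 are made up for this example, and it assumes the file was compiled with debug_info; otherwise it just reports that there is nothing to read.

```erlang
-module(spec_info).
-export([specs/1]).

%% Return the -spec attributes found in the abstract code of a .beam file.
%% Beam is a path to a .beam file (or a binary holding its contents).
%% spec_info and specs/1 are hypothetical names used only in this sketch.
specs(Beam) ->
    case beam_lib:chunks(Beam, [abstract_code]) of
        {ok, {_Mod, [{abstract_code, {_Vsn, Forms}}]}} ->
            %% Each spec is stored as an attribute form in the abstract code.
            {ok, [Spec || {attribute, _Line, spec, Spec} <- Forms]};
        {ok, {_Mod, [{abstract_code, no_abstract_code}]}} ->
            %% Compiled without debug_info: the specs are gone as well.
            {error, no_abstract_code};
        {error, beam_lib, Reason} ->
            {error, Reason}
    end.
```

Something like spec_info:specs(code:which(lists)) should then give either {ok, Specs} or {error, no_abstract_code}, depending on how that .beam was built.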

Regarding instrumenting, I think that would be really great to have. There are already tools which can achieve something like this (meck for instance), but less obtrusive tools are always good :) (Meck renames a module to M_meck and then generates a new module which calls the mocked one in some places and the mocking code in others. This method could be used for instrumentation, but it would not be desirable in a live system, as it is neither atomic nor very maintainable.)

Describing where you want to put your instrumentation is harder to figure out, though, and to date no one has really managed to write a good wrapper for match specs. The best one (imo) is the dbg:fun2ms function, e.g.:

dbg:fun2ms(fun([A,B,_]) when element(1,A) == record,element(2,B) == dict -> 
                message(caller()),exception_trace();
              ([_,B]) when B == debug -> 
                return_trace() 
           end).

Though I'm not sure how much clearer it is than the match specification it expands to:

[{['$1','$2','_'],
  [{'==',{element,1,'$1'},record},
   {'==',{element,2,'$2'},dict}],
  [{message,{caller}},{exception_trace}]},
 {['_','$1'],[{'==','$1',debug}],[{return_trace}]}]
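For anyone who wants to try this, a minimal sketch of handing such a match specification to the tracer follows. The names trace_demo, my_mod and my_fun are placeholders, not real modules. Note that inside a compiled module dbg:fun2ms needs the ms_transform.hrl include, whereas in the shell it can be called directly on a literal fun.

```erlang
-module(trace_demo).
-export([ms/0, start/0]).

%% Needed so that dbg:fun2ms/1 is expanded at compile time.
-include_lib("stdlib/include/ms_transform.hrl").

%% The second clause of the example above, on its own.
ms() ->
    dbg:fun2ms(fun([_, B]) when B == debug -> return_trace() end).

%% Start a tracer, enable call tracing for all processes and set the
%% match specification on my_mod:my_fun (placeholder names, any arity).
start() ->
    dbg:tracer(),            % {error, already_started} if one is running
    dbg:p(all, c),
    dbg:tpl(my_mod, my_fun, ms()).
```

Here trace_demo:ms() gives the same term as the second element of the list above, so the fun form and the raw form can be used interchangeably with dbg:tpl.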

Using the improved -specs that will come in R14B04 it would be possible to write a spec which describes the same thing. Keeping things in Erlang has the advantage of it being easier to build tools around it though. 

Lukas

----- Original Message -----
From: "Tim Watson" <watson.timothy@REDACTED>
To: "Motiejus Jakštys" <desired.mta@REDACTED>
Cc: martynas@REDACTED, erlang-questions@REDACTED, cirka@REDACTED
Sent: Monday, September 26, 2011 10:13:23 PM
Subject: Re: [erlang-questions] getting type info at runtime

2011/9/26 Motiejus Jakštys < desired.mta@REDACTED > 

On Mon, Sep 26, 2011 at 17:28, Jonas Boberg 
< jonas.boberg@REDACTED > wrote: 
> Hi, 
> 
> We solve this by having a parse transformation that adds additional 
> functions to the module (at compile time). More specifically, we use 
> this to generate functions that serialize and deserialize json objects 
> to records, including type verification based on the spec annotations. 

Hi, 

is this hosted somewhere? 

Motiejus Jakštys 

Hi all, 

Thanks for the responses - I'll answer a few questions and clarify. I should have worded this question "getting type info from beam code compiled without debug_info" instead. I'd like to be able to get the -spec info for several reasons: 

1. Discovery - think hoogle for Erlang packages 
2. Static Analysis 
3. Instrumentation 

The idea of searching based on type information has come up on the list before. Item (2) is obvious, and tools like dialyzer and QuickCheck/PropEr are already using this type information to great effect. I just haven't figured out how to unpick what they're doing properly. 

As for (3), there are amazing capabilities built into Erlang thanks to the tracing support in ERTS. I'm using these for diagnostic purposes, but I don't see them as something that I'd like to have *always on* in my production code. There are limitations to using a tracer process, and while writing to a file or socket is safe(r), there are limited guarantees about what gets delivered, in what order, and so on. Tracing feels to me like something I turn on for diagnostic purposes. The kind of instrumentation I'm after is more akin to AOP, where I want the build system to do some work for me while I keep tight control of what code gets generated, as it is now a proper part of my application.

The kind of post processing on the beam code I'd like to do might insert things like logging statements or increment lightweight performance counters. I'd like to do this as part of my build, so that I can 

(a) pretty print the beam (using erl_pp) if I want to see what's been added 
(b) make sure that the instrumented code is what gets unit/integration/load tested 

In order to apply these modifications to my own source code, I can use a parse_transform and that's easy enough. I'd also like to have a declarative approach for defining where the instrumentation/code-weaving takes place, and type signatures seem like a good candidate for this. Let's say I want to put a performance counter around calls to any function in any module which has a name ending in _db, taking a #'myapp.query'{} record and an opaque connection_handle() type. I might define the target operations (in my own source code which is to be instrumented) by saying something like: 

around('call(*_log:*(#'myapp.query'{}, connection_handle()))', Handler) 
%% where Handler is an {M,F,A} or callback module conforming to a simple API 
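To make the parse_transform route for one's own sources a bit more concrete, here is a minimal sketch. It does not implement the around/2 pattern language above; it simply prepends an io:format call to every clause of every function in the module being compiled. The module name log_calls is made up for this example, and a real version would select functions by name and argument types instead of instrumenting everything.

```erlang
-module(log_calls).
-export([parse_transform/2]).

%% Hypothetical parse transform: logs entry into every function of the
%% module being compiled. Enable with -compile({parse_transform, log_calls}).
parse_transform(Forms, _Options) ->
    Mod = hd([M || {attribute, _, module, M} <- Forms]),
    [instrument(Mod, Form) || Form <- Forms].

%% Prepend the logging call to the body of each clause of a function form.
instrument(Mod, {function, L, Name, Arity, Clauses}) ->
    Call = log_call(L, Mod, Name, Arity),
    {function, L, Name, Arity,
     [{clause, CL, Ps, Gs, [Call | Body]}
      || {clause, CL, Ps, Gs, Body} <- Clauses]};
instrument(_Mod, Other) ->
    %% Attributes, eof markers and so on pass through untouched.
    Other.

%% Abstract form of: io:format("enter ~p:~p/~p~n", [Mod, Name, Arity])
log_call(L, Mod, Name, Arity) ->
    {call, L,
     {remote, L, {atom, L, io}, {atom, L, format}},
     [{string, L, "enter ~p:~p/~p~n"},
      erl_parse:abstract([Mod, Name, Arity], L)]}.
```

To see what has been added, a transformed function form can be printed with io:put_chars(erl_pp:form(Form)).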

Now I suppose this is somewhat akin to a match specification {'_', '_', ...} with some fancy stuff to handle the arguments such as {['$1', '$1', '$1'], [{is_record, '$1', 'myapp.query', record_info(size, 'myapp.query')}], .... 

So maybe match specifications would make a better language, but they are a bit hard to follow (no offense to the good OTP folks intended) and type specifications with wildcards for naming (modules and functions) seem a bit simpler to me, at least for my purposes. 

Now I *might* want to put a performance counter and/or logging statement (or whatever) around calls to _someone else's code_ such as calls to mochiweb_request or whatever, which I may not have the sources for and which may not have been compiled with debug_info (or used the "abstract_code" parse transform that's floating around). In those cases, it would be really nice if the compiler would keep hold of the type signatures in the beam, even when the abstract code isn't present. Of course in the meanwhile, if the abstract code is present I can get it out using beam_lib, but the clever people who wrote dialyzer/QuickCheck/PropEr must have solved this already. 

So I'm wondering whether 

1. The type introspection code could be "unpicked" from these tools easily - I will need pointing in the right direction though! 
2. The compiler should preserve type information in the generated beam, and if so is this in any roadmap or is a contribution the way to go? 
3. I have got the wrong impression of the trace facility 

I see some kind of "type server" in both PropEr and Dialyzer, and I'm pretty sure that if this were abstracted out into a separate component, it could nicely form the core of an erlang-search web service or something of that ilk. 

I'd also be interested in this code you've mentioned Jonas, if it is open source. For matching type specifications against modules/functions in other people's code however, a parse_transform isn't what I need. 

Please do call out better approaches if you have them, but also bear in mind this is mainly a fun/educational exercise at the moment! 

Cheers, 

Tim 