[erlang-questions] getting type info at runtime

Mon Sep 26 22:13:23 CEST 2011

2011/9/26 Motiejus Jakštys <desired.mta@REDACTED>

> On Mon, Sep 26, 2011 at 17:28, Jonas Boberg
> <jonas.boberg@REDACTED> wrote:
> > Hi,
> >
> > We solve this by having a parse transformation that adds additional
> > functions to the module (at compile time). More specifically, we use
> > this to generate functions that serialize and deserialize json objects
> > to records, including type verification based on the spec annotations.
>
> Hi,
>
> is this hosted somewhere?
>
> Motiejus Jakštys
>

Hi all,

Thanks for the responses - I'll answer a few questions and clarify. I should
have worded this question "getting type info from beam code compiled without
debug_info" instead. I'd like to be able to get the -spec info for several
reasons:

1. Discovery - think hoogle for Erlang packages
2. Static Analysis
3. Instrumentation

The idea of searching based on type information has come up on the list
before. Item (2) is obvious, and tools like dialyzer and QuickCheck/PropEr
are already using this type information to great effect. I just haven't
figured out how to unpick what they're doing properly.

As for (3), Erlang has amazing capabilities built in thanks to the tracing
support in erts. I'm using these for diagnostic purposes, but I don't see
them as something that I'd like to have *always on* in my production code.
There are limitations to using a tracer process, and of course writing to a
file or socket is safe(r), but there are limited guarantees about what gets
delivered, in what order, and so on. Tracing feels to me like something I
turn on for diagnostic purposes. The kind of instrumentation I'm after is
more akin to AOP, where I want to get the build system to do some work for
me but keep tight control of what code gets generated, as it is now a
proper part of my application.

The kind of post processing on the beam code I'd like to do might insert
things like logging statements or increment lightweight performance
counters. I'd like to do this as part of my build, so that I can

(a) pretty print the beam (using erl_pp) if I want to see what's been added
(b) make sure that the instrumented code is what gets unit/integration/load
tested
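
To make (a) concrete, here is a minimal sketch of printing a module back as
source. It assumes the beam still carries its abstract code (i.e. it was
compiled with debug_info), since erl_pp works on abstract forms rather than
on the beam itself. The module and function names (beam_pp, forms_to_source,
beam_to_source) are made up for illustration.

```erlang
-module(beam_pp).
-export([forms_to_source/1, beam_to_source/1]).

%% Pretty print a list of abstract forms as Erlang source text.
%% The trailing {eof, Line} marker is skipped, it carries no source.
forms_to_source(Forms) ->
    lists:flatten([erl_pp:form(F) || F <- Forms, element(1, F) =/= eof]).

%% Read the abstract code chunk from a beam file and pretty print it.
%% Returns {error, ...} when the chunk is missing or in another format.
beam_to_source(BeamFile) ->
    case beam_lib:chunks(BeamFile, [abstract_code]) of
        {ok, {_Mod, [{abstract_code, {raw_abstract_v1, Forms}}]}} ->
            {ok, forms_to_source(Forms)};
        {ok, {_Mod, [{abstract_code, _Other}]}} ->
            {error, no_usable_abstract_code};
        {error, beam_lib, Reason} ->
            {error, Reason}
    end.
```

Running this on the instrumented beam after the build step would show
whatever logging or counter calls were woven in.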

In order to apply these modifications to my own source code, I can use a
parse_transform and that's easy enough. I'd also like to have a declarative
approach for defining where the instrumentation/code-weaving takes place,
and type signatures seem like a good candidate for this. Let's say I want to
put a performance counter around calls to any function in any module whose
name ends in _db, taking a #'myapp.query'{} record and an opaque
connection_handle() type. I might define the target operations (in my own
source code which is to be instrumented) by saying something like:

around("call(*_db:*(#'myapp.query'{}, connection_handle()))", Handler)
%% where Handler is an {M,F,A} or callback module conforming to a simple API

Now I suppose this is somewhat akin to a match specification {'_', '_', ...}
with some fancy stuff to handle the arguments such as {['$1', '$1',
'$1'], [{is_record, '$1', 'myapp.query', record_info(size, 'myapp.query')}],
....

So maybe match specifications would make a better language, but they are a
bit hard to follow (no offense to the good OTP folks intended) and type
specifications with wildcards for naming (modules and functions) seem a bit
simpler to me, at least for my purposes.
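
As a rough illustration of the naming half of this idea, wildcard matching
on module and function names is only a few lines. This is a sketch under my
own assumptions: the module name name_glob and the function matches/2 are
hypothetical, and the only wildcard supported is '*' (any run of characters,
including none). Matching on the argument types would be a separate and much
harder step.

```erlang
-module(name_glob).
-export([matches/2]).

%% matches(Pattern, Name) -> boolean()
%% Pattern is a string such as "*_db"; Name is a module or function atom.
matches(Pattern, Name) when is_list(Pattern), is_atom(Name) ->
    glob(Pattern, atom_to_list(Name)).

%% Both pattern and name exhausted: match.
glob([], []) ->
    true;
%% '*' either matches nothing, or swallows one character and tries again.
glob([$* | Rest], Name) ->
    glob(Rest, Name)
        orelse (Name =/= [] andalso glob([$* | Rest], tl(Name)));
%% Literal characters must be equal.
glob([C | Pattern], [C | Name]) ->
    glob(Pattern, Name);
glob(_, _) ->
    false.
```

With this, name_glob:matches("*_db", users_db) holds while
name_glob:matches("*_db", users_log) does not.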

Now I *might* want to put a performance counter and/or logging statement (or
whatever) around calls to _someone else's code_ such as calls to
mochiweb_request or whatever, which I may not have the sources for and which
may not have been compiled with debug_info (or used the "abstract_code"
parse transform that's floating around). In those cases, it would be really
nice if the compiler would keep hold of the type signatures in the beam,
even when the abstract code isn't present. Of course, in the meantime, if
the abstract code is present I can get it out using beam_lib, but the clever
people who wrote dialyzer/QuickCheck/PropEr must have solved this already.
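
For reference, the beam_lib route looks roughly like the sketch below. It
only works when the abstract code chunk is present, which is exactly the
limitation described above, and it is not a claim about how Dialyzer or
PropEr do it. The module name spec_dump and its function names are
hypothetical. It relies on -spec attributes appearing in the abstract forms
as {attribute, Line, spec, {{Name, Arity}, Signatures}}.

```erlang
-module(spec_dump).
-export([specs/1, specs_from_forms/1]).

%% Return {ok, [{{Name, Arity}, Signatures}]} for every -spec found in
%% the abstract code of a beam file, or {error, Reason} otherwise.
specs(BeamFile) ->
    case beam_lib:chunks(BeamFile, [abstract_code]) of
        {ok, {_Mod, [{abstract_code, {raw_abstract_v1, Forms}}]}} ->
            {ok, specs_from_forms(Forms)};
        {ok, {_Mod, [{abstract_code, no_abstract_code}]}} ->
            %% Compiled without debug_info: nothing to inspect.
            {error, no_abstract_code};
        {ok, {_Mod, [{abstract_code, _Other}]}} ->
            {error, unsupported_abstract_format};
        {error, beam_lib, Reason} ->
            {error, Reason}
    end.

%% Pick the spec attributes out of a list of abstract forms.
specs_from_forms(Forms) ->
    [{FA, Sigs} || {attribute, _Line, spec, {FA, Sigs}} <- Forms].
```

Calling spec_dump:specs("ebin/some_module.beam") (path made up) on a module
built without debug_info returns {error, no_abstract_code}.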

So I'm wondering:

1. Could the type introspection code be "unpicked" from these tools
easily? I will need pointing in the right direction though!
2. Should the compiler preserve type information in the generated beam, and
if so, is this on any roadmap or is a contribution the way to go?
3. Have I got the wrong impression of the trace facility?

I see some kind of "type server" in both PropEr and Dialyzer, and I'm pretty
sure that if this were abstracted out into a separate component, it could
nicely form the core of an Erlang search web service or something of that
ilk.

I'd also be interested in the code you've mentioned, Jonas, if it is open
source. For matching type specifications against modules/functions in other
people's code, however, a parse_transform isn't what I need.

Please do call out better approaches if you have them, but also bear in mind
this is mainly a fun/educational exercise at the moment!

Cheers,

Tim