[erlang-questions] Reading comments/annotations inside parse_transforms

Thu Jul 12 10:29:54 CEST 2012

On 10 Jul 2012, at 20:20, Richard Carlsson <carlsson.richard@REDACTED> wrote:

> On 07/05/2012 11:10 AM, Tim Watson wrote:
>> There doesn't appear to be any way of doing this currently. There is
>> support in erl_syntax for working with these and there is also an
>> erl_comment_scan module in the syntax-tools application which parses
>> and returns comments from a file (or whatever). What there doesn't
>> appear to be, is a means to get erl_parse to return comments as
>> attributes (?) in the AST that the compiler passes to a
>> parse_transform.
>> 
>> My question then, is this: what is the best (only!?) way to process
>> comments at the same time as source code in a parse_transform? So far,
>> given that this seems unlikely to work OOTB, I'm assuming I'll have to
>> do something like:
>> 
>> 1. once you hit the 'file' attribute, parse the comments and stash
>> these away in a useful place
>> 2. process the forms as usual, comparing each line number for each
>> form, against the line numbers stored for the comments
>> 3. when you get an interleaving of code and comments, you know where
>> the comments reside in the source file in relation to the form you're
>> currently working with
>> 
>> I *can* do this of course, but it seems like an awful lot of hard
>> work. Why can't comments be preserved in the AST passed to a parse
>> transform, and then just dropped later on!? Is there an easier way of
>> doing this that I'm currently missing?
> 
> The compiler toolchain just isn't targeted at preserving comments; they are treated as whitespace and are discarded already at the tokenization stage (erl_scan), even before the preprocessing stage (epp). After that, there's the parsing stage, and then the parse transforms are called. So, as you said, you'll need to use the -file("...") hints to locate the source files and read them again to extract the comments.
> 
> It would certainly be possible to make a compiler that preserves comments for later passes, but the way it's currently done is the "traditional" way of writing a compiler, and changing it afterwards can be difficult, since all the code that expects the current behaviour has to be updated.
> 
> The good news is that the work of digging out comments and connecting them to the abstract syntax trees has already been done, in the syntax_tools library - EDoc uses exactly this (although not as a parse transform). You can call erl_comment_scan:file/1 to get the comment lines, and then use erl_recomment:recomment_forms/2 to attach the comments to the abstract syntax trees that the parse transform got. The result is an extended abstract syntax tree as defined by the erl_syntax module. You can use the erl_syntax functions to manipulate the tree, and when you're done, you need to call erl_syntax:revert/1 to revert the representation to the form that the compiler understands (this loses the comments again). For a detailed usage example, see edoc_extract:source/3 and friends.
> 
>    /Richard

Thanks Richard, that's a great deal better than what I had in mind!
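
For the archives, here is a minimal sketch of how I read that recipe. It is untested against a real project, so treat it as a starting point. The module name comment_pt and the helpers source_file/1 and function_comments/1 are my own illustrative names, not part of syntax_tools. It assumes the first -file attribute names the module being compiled, and it ignores the fact that forms pulled in from header files carry line numbers from those files. I use erl_syntax:revert_forms/1, the form-list variant of revert/1, because it skips standalone comment nodes.

```erlang
-module(comment_pt).
-export([parse_transform/2, function_comments/1]).

%% Entry point called by the compiler with the abstract forms.
parse_transform(Forms, _Options) ->
    case source_file(Forms) of
        {ok, File} ->
            %% Re-read the source, since the scanner dropped the comments.
            Comments = erl_comment_scan:file(File),
            %% Attach the comments to the forms (erl_syntax tree).
            Tree = erl_recomment:recomment_forms(Forms, Comments),
            lists:foreach(
              fun({Name, Arity, Text}) ->
                      io:format("~p/~p: ~p~n", [Name, Arity, Text])
              end,
              function_comments(Tree)),
            %% Back to plain erl_parse forms; the comments are lost again.
            erl_syntax:revert_forms(Tree);
        error ->
            Forms
    end.

%% Hypothetical helper: the first -file attribute gives the source path.
source_file([{attribute, _, file, {File, _}} | _]) -> {ok, File};
source_file([_ | Rest]) -> source_file(Rest);
source_file([]) -> error.

%% Hypothetical helper: for each function form in the recommented tree,
%% collect the text lines of the comments attached directly above it.
function_comments(Tree) ->
    Nodes = erl_syntax:form_list_elements(
              erl_syntax:flatten_form_list(Tree)),
    [{erl_syntax:atom_value(erl_syntax:function_name(N)),
      erl_syntax:function_arity(N),
      lists:append([erl_syntax:comment_text(C)
                    || C <- erl_syntax:get_precomments(N)])}
     || N <- Nodes, erl_syntax:type(N) =:= function].
```

To try it, compile comment_pt first and then build another module with -compile({parse_transform, comment_pt}). A comment separated from the function by a blank line ends up as a standalone comment node in the form list instead of a precomment, so this sketch will not report it.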