[erlang-questions] semantic tagger

Wed Apr 25 23:31:47 CEST 2007

Hi,

On 4/25/07, Joe Armstrong <erlang@REDACTED> wrote:
> Hello, has anybody got a "semantic tagger" that can tag Erlang source
> code files?

What do you mean by "semantic tagger"?

> I want to convert a .erl file into a sequence of tokens
>
> [{Tag, String}]
>
> where Tag is a semantic tag (like comment, variable, atom, functionCall, etc.)
> that tags the following string.
>
> Constraint: If I concatenate all the strings in token list I should get the
> original file content. I want to preserve all input formatting.

I venture to guess that you mean an "extended lexer", as the
alternative would be a parser.

There is something like that in ErlIDE. It returns whitespace and
macros as well, and for each token it gives not only the value but
also the original text (for example {integer, 20, "16#14"}). It is a
modified version of the standard lexer. It returns the offset in the
file rather than the column position (column position would be
trivial to add).
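
For illustration, here is a minimal sketch of the same idea using the
standard erl_scan module instead of erlide_scan (whose interface is not
shown here). It assumes a current OTP release in which erl_scan:string/3
accepts the `return` and `text` options. The module name tagger_sketch
and the functions tag/1 and roundtrip/1 are made up for this example.

```erlang
-module(tagger_sketch).
-export([tag/1, roundtrip/1]).

%% Hypothetical helper: turn source text into a [{Tag, String}] list.
%% Tag is the lexical category reported by erl_scan (var, atom, integer,
%% comment, white_space, '(' and so on), not a true semantic tag.
tag(Source) when is_list(Source) ->
    %% 'return' keeps white space and comments in the token list,
    %% 'text' stores each token's original source text in its annotation.
    {ok, Tokens, _EndLocation} = erl_scan:string(Source, 1, [return, text]),
    [{erl_scan:category(T), erl_scan:text(T)} || T <- Tokens].

%% Check the constraint from the question: concatenating all the strings
%% should give back the original input.
roundtrip(Source) ->
    lists:append([Text || {_Tag, Text} <- tag(Source)]) =:= Source.
```

Calling tagger_sketch:tag("X = 16#14.\n") should keep the literal as
the string "16#14" rather than the value 20, and roundtrip/1 can be
used to check that no formatting was lost. A tag like functionCall is
out of reach at this level, since the scanner only sees an atom
followed by '('.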

The file can be found at
http://erlide.svn.sourceforge.net/viewvc/erlide/trunk/org.erlide.core/erl/erlide_scan.erl?view=log

I think that a full semantic lexer would have to be a parser that
builds a flat list of semantic tokens instead of an abstract syntax
tree. Which semantic tokens would you need, besides functionCall?

best regards,
Vlad