[erlang-questions] semantic tagger

Ulf Wiger (TN/EAB)
Thu Apr 26 09:23:32 CEST 2007

... actually, what I ended up doing in CCviewer was to collect whitespace
and comments in a list between each real token. Thus, the token stream
looked like:
[Tok1, Whitespace1, Tok2, Whitespace2 | ...]
That wasn't too bad, but to make it a bit more interesting, I also wanted
not only to preserve formatting, but also to do a decent job on code that 
might not compile (I would not do that again, though...)
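
A minimal sketch of that interleaved stream, for illustration only. It is not the CCviewer code: it assumes a current OTP release in which erl_scan:string/3 accepts the 'return' and 'text' options, and the module and function names (ws_stream, interleave/1, pair/1, is_trivia/1) are made up here.

```erlang
-module(ws_stream).
-export([interleave/1]).

%% Scan Source and return {Leading, [Tok1, Trivia1, Tok2, Trivia2, ...]}.
%% Leading is any whitespace/comments before the first real token;
%% each TriviaN is the (possibly empty) list of whitespace and comment
%% tokens that follow TokN.
interleave(Source) ->
    %% 'return' keeps white_space and comment tokens, 'text' keeps each
    %% token's original text, and {1, 1} asks for {Line, Column} locations.
    {ok, Toks, _EndLoc} = erl_scan:string(Source, {1, 1}, [return, text]),
    {Leading, Real} = lists:splitwith(fun is_trivia/1, Toks),
    {Leading, pair(Real)}.

%% Pair every real token with the trivia that follows it.
pair([]) ->
    [];
pair([Tok | Rest]) ->
    {Trivia, Rest1} = lists:splitwith(fun is_trivia/1, Rest),
    [Tok, Trivia | pair(Rest1)].

is_trivia(Tok) ->
    lists:member(erl_scan:category(Tok), [white_space, comment]).
```

Keeping the trivia as a list (rather than a single string) means a run such as "space, comment, newline" stays as separate tokens, so each piece can still be rendered differently later.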
When I first wrote the html:izer, I was into experimenting with doing the
brunt of the work in function head patterns. This led to various
as well, but was a good learning experience.
Here's an example of what it could look like. The purpose was to convert
source code to HTML with hypertext links on function calls, function heads
(which list the callers of the function), and record and macro references.
(If the pretty-printer got confused, it would throw an exception, and
un-annotated text would be displayed instead.)
expr1([_T1={symbol, L1, C1, Ce1, '#'},  WC1,
       _T2={symbol, L2, C2, Ce2, '?'},  WC2,
       _T3={Tag,    L3, C3, Ce3,  W},   WC3,
       _T4={symbol, L4, C4, Ce4, '.'}|Ts]?Xs, Cur, L, Term,
      XRefs, FF, FA, S) when ?w_or_a(Tag) ->
    %% Hellish combination of macro expansion and record selector syntax
    %% We'd like to hypertext link both, but can't do that. Since we
    %% expand the macro, we'll create a hypertext reference to the
    %% We also have to consume the dot in order to keep the parser from 
    %% derailing.
    {Ref, Link} =
        case ets:lookup(S#state.tab, {define, W}) of
            [] ->
                %% hmmm...
                Ref1 = {mfa, S#state.modulename, W, ?macro_arity_int, L3},
                FL1 = funlink(W, ?macro_arity_int, W),
                {Ref1, FL1};
            [{_, F, _, IncMod, 0}] ->
                Ref2 = {mfa, IncMod, W, ?macro_arity_int, L3},
                FL2 = macrolink(F, W, S),
                {Ref2, FL2}
        end,
    Out = [space(L, Cur, L1, C1, S),
           "#",                 wc(WC1,L1,Ce1,L2,C2, S),
           "?",                 wc(WC2,L2,Ce2,L3,C3, S),
           Link,                wc(WC3,L3,Ce3,L4,C4, S),
           "."],
    S1 = out(Out, S),
    expr(Ts, Ce4, L4, Term, [Ref|XRefs], FF, FA, S1 ?LL);

Ulf W


On Behalf Of Ulf Wiger
	Sent: 25 April 2007 18:35
	To: Joe Armstrong
	Cc: Erlang
	Subject: Re: [erlang-questions] semantic tagger

	I did it to some extent in CCviewer, but I wouldn't recommend reusing the code...
	I think that for starters, the token scanner needs to preserve column information.
	I think the standard tokenizer should have an option to do this.
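
For illustration, a small sketch of what column-preserving scanning can look like. It assumes a current OTP release, where erl_scan:string/2 takes a {Line, Column} start location; the module and function names (col_demo, locations/1) are made up here.

```erlang
-module(col_demo).
-export([locations/1]).

%% Return [{Category, {Line, Column}}] for every token in Source.
%% Passing {1, 1} as the start location (instead of a bare line number)
%% makes the scanner track columns as well as lines.
locations(Source) ->
    {ok, Toks, _EndLoc} = erl_scan:string(Source, {1, 1}),
    [{erl_scan:category(T), erl_scan:location(T)} || T <- Toks].
```

Calling col_demo:locations("foo(Bar).") gives one {Category, {Line, Column}} pair per token, which is enough to work out how much space sits between two consecutive tokens.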

	Ulf W
	2007/4/25, Joe Armstrong:

		Hello, has anybody got a "semantic tagger" that can tag Erlang source
		code files?
		I want to convert a .erl file into a sequence of tokens
		[{Tag, String}]
		where Tag is a semantic tag (like comment, variable, atom, functionCall, etc.)
		that tags the following string.
		Constraint: If I concatenate all the strings in token list I should get the
		original file content. I want to preserve all input
		Has anybody done this?
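
A rough sketch of the [{Tag, String}] idea with the round-trip constraint, for illustration only. The tags are mostly lexical rather than truly semantic, and it assumes a current OTP release whose erl_scan supports the 'return' and 'text' options; the module and function names (lex_tagger, tag/1, roundtrip/1, tag_tokens/1, tag_of/1) are made up here.

```erlang
-module(lex_tagger).
-export([tag/1, roundtrip/1]).

%% Return [{Tag, String}] covering every character of Source.
tag(Source) ->
    {ok, Toks, _EndLoc} = erl_scan:string(Source, {1, 1}, [return, text]),
    tag_tokens(Toks).

%% An atom directly followed by '(' is tagged functionCall. This also
%% catches function heads; telling the two apart needs real parsing,
%% which this sketch skips.
tag_tokens([{atom, _, _} = A, {'(', _} = P | Rest]) ->
    [{functionCall, erl_scan:text(A)} | tag_tokens([P | Rest])];
tag_tokens([Tok | Rest]) ->
    [{tag_of(erl_scan:category(Tok)), erl_scan:text(Tok)} | tag_tokens(Rest)];
tag_tokens([]) ->
    [].

tag_of(comment)     -> comment;
tag_of(white_space) -> whitespace;
tag_of(var)         -> variable;
tag_of(atom)        -> atom;
tag_of(Other)       -> Other.   %% punctuation, keywords, literals

%% The constraint: concatenating all strings gives back the input.
roundtrip(Source) ->
    lists:append([S || {_Tag, S} <- tag(Source)]) =:= Source.
```

Because whitespace and comments are kept as ordinary entries in the list, the constraint can be checked directly with lex_tagger:roundtrip/1 on any source string that scans without error.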