Lexing and parsing, was Re: plain_fsm - for beginners and purists

Thomas Lindgren <>
Wed Feb 11 19:36:46 CET 2004


--- Ulf Wiger <> wrote:

> Joe recently suggested a -compile({token_transform,
> Module}) directive,
> so that he could use "!!" without forcing everyone
> to hack the parser.

Interesting idea, but wouldn't the Module have to be
an entire tokenizer? Just processing a stream of
tokens (e.g., [..., bang, bang, ...] => [..., bangbang,
...]) might lead to confusion: how to separate "!!"
from "! !"?
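
To make the problem concrete, here is a minimal sketch of
such a token transform in Python. It assumes, purely for
illustration, that every token records the column where it
starts; the Token shape and the merge_bangs name are made up
here and are not the Erlang scanner's format. With columns,
the transform can tell "!!" from "! !"; without them, the
two token streams look identical.

```python
from typing import NamedTuple


class Token(NamedTuple):
    kind: str  # e.g. "atom", "bang", "bangbang"
    line: int
    col: int   # assumed: column of the token's first character
    text: str


def merge_bangs(tokens):
    """Rewrite two directly adjacent 'bang' tokens into one 'bangbang'."""
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        touching = (
            nxt is not None
            and tok.kind == "bang"
            and nxt.kind == "bang"
            and nxt.line == tok.line
            and nxt.col == tok.col + 1  # no whitespace in between
        )
        if touching:
            out.append(Token("bangbang", tok.line, tok.col, "!!"))
            i += 2
        else:
            out.append(tok)
            i += 1
    return out


# Token stream for "a !! b": the two bangs sit in columns 3 and 4.
joined = [Token("atom", 1, 1, "a"), Token("bang", 1, 3, "!"),
          Token("bang", 1, 4, "!"), Token("atom", 1, 6, "b")]

# Token stream for "a ! ! b": the bangs sit in columns 3 and 5.
spaced = [Token("atom", 1, 1, "a"), Token("bang", 1, 3, "!"),
          Token("bang", 1, 5, "!"), Token("atom", 1, 7, "b")]
```

If the tokens carried only line numbers, the "touching" test
could not be written, and the transform would have to merge
both inputs or neither.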

Anyway, it might be neat to have a "composable lexer",
where you, say, compose a sequence of tokenizer
behaviours into a full tokenizer. It seems fairly
straightforward to implement, at first blush, though
the obvious implementation would be slow.

(E.g., it seems difficult to avoid scanning characters
many times, if the NFAs are scattered among various
modules, and perhaps added and removed here and
there.)
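
A rough sketch of the "obvious implementation", in Python
rather than Erlang: each tokenizer behaviour is just a
(kind, regex) pair, and the lexer tries every registered
rule at the current position and keeps the longest match.
The ComposableLexer class and its method names are
hypothetical. Note that every rule rescans the same
characters, which is exactly the slowness worried about
above.

```python
import re


class ComposableLexer:
    """Lexer assembled from independently added (kind, regex) rules."""

    def __init__(self):
        self.rules = []  # (kind, compiled regex), in the order added

    def add(self, kind, pattern):
        self.rules.append((kind, re.compile(pattern)))
        return self

    def remove(self, kind):
        self.rules = [(k, rx) for (k, rx) in self.rules if k != kind]
        return self

    def tokenize(self, text):
        pos, out = 0, []
        while pos < len(text):
            if text[pos].isspace():
                pos += 1
                continue
            best = None  # (kind, end offset) of the longest match so far
            for kind, rx in self.rules:
                m = rx.match(text, pos)  # every rule rescans from pos
                if m and m.end() > pos and (best is None or m.end() > best[1]):
                    best = (kind, m.end())
            if best is None:
                raise ValueError(f"no rule matches at offset {pos}")
            kind, end = best
            out.append((kind, text[pos:end]))
            pos = end
        return out


lexer = ComposableLexer()
lexer.add("atom", r"[a-z][A-Za-z0-9_]*").add("bang", r"!")
before = lexer.tokenize("a !! b")   # "!!" lexes as two bang tokens

lexer.add("bangbang", r"!!")        # compose in one more behaviour
after = lexer.tokenize("a !! b")    # longest match now yields bangbang
spaced = lexer.tokenize("a ! ! b")  # still two separate bang tokens
```

Because the lexer sees the characters and not a finished
token stream, "!!" and "! !" come out differently without
any position bookkeeping. Ties between equally long matches
go to the rule added first, which is an arbitrary choice in
this sketch.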

In the same vein, it could be useful to have a
"composable parser" where you (handwave) just add the
parsing rules you'd like. Earley's algorithm can parse
any context-free language, if memory serves, so it
might be feasible.
Don't ask me to do it, though :-)
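
For the curious, a small Earley recognizer is enough to try
the idea out. The sketch below is Python, only answers
yes/no (it builds no parse tree), and assumes no rule has an
empty right-hand side, which sidesteps the usual nullable
corner case. The Grammar class, add_rule and recognize are
names invented for this example.

```python
class Grammar:
    """Context-free grammar that rules can be added to at any time."""

    def __init__(self, start):
        self.start = start
        self.rules = {}  # nonterminal -> list of right-hand sides (tuples)

    def add_rule(self, lhs, *rhs):
        if not rhs:
            raise ValueError("empty right-hand sides are not supported here")
        self.rules.setdefault(lhs, []).append(tuple(rhs))
        return self


def recognize(grammar, tokens):
    """Return True if tokens can be derived from grammar.start."""
    n = len(tokens)
    # An item is (lhs, rhs, dot, origin); chart[i] holds items ending at i.
    chart = [[] for _ in range(n + 1)]
    seen = [set() for _ in range(n + 1)]

    def add(i, item):
        if item not in seen[i]:
            seen[i].add(item)
            chart[i].append(item)

    for rhs in grammar.rules.get(grammar.start, []):
        add(0, (grammar.start, rhs, 0, 0))

    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] may grow while we walk it
            lhs, rhs, dot, origin = chart[i][j]
            j += 1
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar.rules:             # predict
                    for alt in grammar.rules[sym]:
                        add(i, (sym, alt, 0, i))
                elif i < n and tokens[i] == sym:     # scan
                    add(i + 1, (lhs, rhs, dot + 1, origin))
            else:                                    # complete
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(i, (l2, r2, d2 + 1, o2))

    return any(lhs == grammar.start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[n])


g = Grammar("E")
g.add_rule("E", "E", "+", "T").add_rule("E", "T").add_rule("T", "n")
plain = recognize(g, ["n", "+", "n"])         # accepted by the base rules
before = recognize(g, ["n", "!!", "n"])       # "!!" is not known yet
g.add_rule("E", "E", "!!", "T")               # just add the rule you want
after = recognize(g, ["n", "!!", "n"])
```

The left-recursive rule for E needs no rewriting, which is
much of the appeal: added rules do not have to fit LL or LR
restrictions. The price is speed, since Earley is cubic in
the input length in the worst case.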

The resulting language would be ... er. Something
unique.

-- Thomas





More information about the erlang-questions mailing list