[erlang-questions] Leex - a lexical analyzer generator, released

Robert Virding <>
Wed May 28 01:56:40 CEST 2008


2008/5/27 Ben Hood <>:

> Robert,
> On 24 May 2008, at 16:39, Robert Virding wrote:
>
>> Apart from this there are no other noticeable changes. The Erlang and LFE
>> token syntaxes are included as examples.
>>
>
> This may sound like a stupid question, but how do you use it?
>
> In the documentation, it says that you can tokenize a stream by calling
>
> io:request(InFile, {get_until,Prompt,Module,tokens,[Line]})
>
> and in the old example tutorial, which uses leex:gen/2 to create the lexer,
> it uses the following:
>
> io:requests(Stream, [{get_until,foo,Module, tokens,[LineNo]}])
>
> to pass the stream to the parser.
>
> This doesn't seem to work with the new version of leex and I can't seem to
> find any documentation about io:request/1,2 or io:requests/1,2


First things first: there is no documentation on io:request/1/2 and
io:requests/1/2. They are sort of internal functions that interface the io
system to scan input. They do the same thing, except that with requests you
can give a list of requests, both input and output. No, don't ask, there is
no documentation about the io system either. At least I don't think so. Tell
me if I am wrong.

It seems as if I have saved my summer from the long easy days of doing
nothing. :-)

Leex is compliant with the io system and its functions work with it.
Basically when you process an xrl file you get an erlang module with 3 main
interface functions:

string/1/2 takes a string of characters as input and tries to tokenise it.
All or nothing.

token/2/3 takes enough characters from an io stream to return one token.

tokens/2/3 takes enough characters to tokenise up to and including an
end_token (see new leex.txt and erlang_scan.xrl) or to the end of input.
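To make the three entry points concrete, here is a minimal sketch of building a scanner and calling string/1 on it. The token grammar, the module name leex_demo and the helper names are made up for illustration; the sketch also assumes a leex version that provides leex:file/1 and the TokenLine/TokenChars variables in rule bodies.

```erlang
-module(leex_demo).
-export([build/0, demo/0]).

%% A tiny, made-up token grammar (an assumption, not from the post):
%% integers, lowercase words and '.' as the end token. Whitespace is skipped.
xrl_source() ->
    "Definitions.\n"
    "D = [0-9]\n"
    "L = [a-z]\n"
    "WS = [\\s\\t\\n]\n"
    "\n"
    "Rules.\n"
    "{D}+ : {token,{integer,TokenLine,list_to_integer(TokenChars)}}.\n"
    "{L}+ : {token,{word,TokenLine,TokenChars}}.\n"
    "\\. : {end_token,{dot,TokenLine}}.\n"
    "{WS}+ : skip_token.\n"
    "\n"
    "Erlang code.\n".

%% Write mysynt.xrl, run it through leex, then compile and load mysynt.
build() ->
    ok = file:write_file("mysynt.xrl", xrl_source()),
    {ok, _ScannerFile} = leex:file("mysynt.xrl"),
    {ok, mysynt, Bin} = compile:file("mysynt", [binary]),
    {module, mysynt} = code:load_binary(mysynt, "mysynt.beam", Bin),
    ok.

%% string/1 is all or nothing: it returns {ok,Tokens,EndLine} only if the
%% whole string could be tokenised.
demo() ->
    ok = build(),
    mysynt:string("foo 42 bar.").
```

Calling leex_demo:demo() from the shell leaves mysynt.xrl and mysynt.erl in the current directory, which is handy for looking at what leex generated.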

Assume we have the file mysynt.xrl, running this through leex and then
compiling the erl file will generate the module mysynt. You can then do:

{ok,InFile} = file:open("input.file", [read]),
{ok,Toks,EndLine} = io:request(InFile,
{get_until,prompt,mysynt,tokens,[1]}),

which opens the file "input.file" for reading and tries to tokenise the
whole file (the tokens function). If it succeeds then Toks will contain the
list of all tokens and EndLine will be the line number of the end of input.
N.B. the prompt 'prompt' is ignored for files but must be there. If you just
want to read one token you would do:

{ok,Tok,EndLine} = io:request(InFile, {get_until,prompt,mysynt,token,[1]}),

which will read one token.
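Reading one token at a time extends naturally to a loop. The following is only a sketch: the module name token_loop and the function read_all/2 are hypothetical, Scanner stands for any leex-generated module such as mysynt, and the request tuple is the form used above (newer releases, I believe, also document a variant with an extra encoding element).

```erlang
-module(token_loop).
-export([read_all/2]).

%% Read every token from FileName, one io:request call per token.
%% Scanner is the name of a leex-generated module, e.g. mysynt.
read_all(FileName, Scanner) ->
    {ok, InFile} = file:open(FileName, [read]),
    try
        loop(InFile, Scanner, 1, [])
    after
        file:close(InFile)
    end.

%% Each call continues from the line number where the previous token ended.
loop(InFile, Scanner, Line, Acc) ->
    case io:request(InFile, {get_until,prompt,Scanner,token,[Line]}) of
        {ok, Tok, EndLine} ->
            loop(InFile, Scanner, EndLine, [Tok|Acc]);
        {eof, _EndLine} ->
            {ok, lists:reverse(Acc)};
        {error, _Info, _EndLine} = ScanError ->
            ScanError;                      % the scanner rejected the input
        {error, _Reason} = IoError ->
            IoError                         % the io system itself failed
    end.
```

With the scanner from the example above this would be called as token_loop:read_all("input.file", mysynt).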

If you have built a parser with yecc into the module myparse you could then
parse the input tokens by calling:

myparse:parse(Toks)

N.B. a yecc parser assumes that you give it a list containing all the tokens
it needs, and no more, in the call. This is the reason for the tokens/2/3
call from leex; '.' in Erlang is an end_token. There is a way to get a yecc
parser to get more tokens by itself but the interface to it is broken and
sucks.
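Put together, the glue between scanner and parser can be quite small. This sketch assumes the mysynt scanner and myparse parser named above already exist and are compiled; the module name scan_parse and the function file/1 are hypothetical. Whether the end token itself has to appear in the grammar depends on how myparse is written.

```erlang
-module(scan_parse).
-export([file/1]).

%% Tokenise FileName up to the first end_token (or end of input) with the
%% leex-generated scanner mysynt, then hand exactly those tokens to the
%% yecc-generated parser myparse.
file(FileName) ->
    {ok, InFile} = file:open(FileName, [read]),
    try
        case io:request(InFile, {get_until,prompt,mysynt,tokens,[1]}) of
            {ok, Toks, _EndLine} ->
                myparse:parse(Toks);        % {ok,Result} or {error,...}
            {eof, _EndLine} ->
                eof;                        % nothing left to tokenise
            Error ->
                Error                       % scanner or io error
        end
    after
        file:close(InFile)
    end.
```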


> Do you have a example that demonstrates glue code to create the lexer and
> parser and then interpret a stream using the current code base?


That was it. You can check lfe_io.erl in LFE and look at the
parse_file/1 and read_file/1 functions. I use my own parser as I need it to
return remaining tokens so I can read one sexpr at a time. Yecc will not do
this, although it wouldn't be difficult to get it to do so. :-(

I will try to write a Rationale which describes the io system, and other
core features, in more detail and how it all hangs together.

I hope that helped a little. Maybe others could point you to more code using
leex and parsers.

Robert