[erlang-questions] erl_scan issues

Tue Apr 21 17:08:11 CEST 2015

On Tue, Apr 21, 2015 at 4:53 PM, Anthony Ramine <n.oxyde@REDACTED> wrote:

> On 21 Apr 2015, at 14:15, Vlad Dumitrescu <vladdu55@REDACTED> wrote:
> > The dot is recognized anyway
>
> Not always.
>
> f() -> #foo.bar
> g() -> 3.14

What I mean is that the dot token doesn't have to contain the whitespace in
order to be recognized. As with the % character, the whitespace doesn't need
to be consumed and can become part of the next token. This would also be
more consistent, since otherwise a newline is always the first character of
a white_space token.
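
To see what text the scanner currently records for the dot token, something
like the following sketch can be used. The module and function names
(dot_text_demo, dot_text/1) are made up for illustration, and erl_scan:text/1
is assumed to be available (on older releases erl_scan:token_info/2 plays
the same role). The exact text returned may differ between releases, so it
is worth printing it on yours.

```erlang
-module(dot_text_demo).
-export([dot_text/1]).

%% Scan Source with the 'text' option and return the source text
%% recorded for the first 'dot' token, or 'none' if no 'dot' token
%% was produced (hypothetical helper, not part of erl_scan).
dot_text(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source, 1, [text]),
    case [T || {dot, _} = T <- Tokens] of
        [Dot | _] -> erl_scan:text(Dot);
        []        -> none
    end.
```

Calling dot_text_demo:dot_text("f() -> ok. ") shows whether the trailing
space is part of the token text, while dot_text_demo:dot_text("f() -> #foo.bar")
returns none, because the dot in a record field access is scanned as a '.'
token rather than as the terminating dot.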

After digging in the code, I believe I have an explanation, though I'm not
sure it's correct. It seems to be an artifact of how the reentrant lexer
(erl_scan:tokens) is implemented. Normally the end of the input is marked by
'eof', but if the input is a string the lexer can't know whether the input
is finished, so that it should return a 'done' result, or whether more input
may follow, so that it should return 'more'. Ending the string with ". "
(dot space) makes the implementation return the finished result. This is
even mentioned in the docs for erl_scan:tokens. For such cases I would
prefer to end the input with 'eof' (String ++ eof), which is already done in
several places in erl_scan, in the compiler, and in dialyzer.
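
A minimal sketch of the 'eof' approach, assuming the caller knows the string
is the whole input. The module and function names (scan_eof_demo,
scan_form/1) are hypothetical; only erl_scan:tokens/3 is the real API.

```erlang
-module(scan_eof_demo).
-export([scan_form/1]).

%% Scan one form from Chars with the reentrant scanner. If the scanner
%% asks for more input, assume there is none and say so by passing eof
%% (hypothetical helper, not part of erl_scan).
scan_form(Chars) ->
    case erl_scan:tokens([], Chars, 1) of
        {done, Result, _LeftOver} ->
            %% The input was already terminated, e.g. by ". "
            Result;
        {more, Continuation} ->
            %% E.g. the input ended right after the dot
            {done, Result, _LeftOver} =
                erl_scan:tokens(Continuation, eof, 1),
            Result
    end.
```

With this, scan_eof_demo:scan_form("foo.") and scan_eof_demo:scan_form("foo. ")
should both yield the tokens for foo followed by a dot token, without the
caller having to append whitespace to the input.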

regards,
Vlad