<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">Thank you, Richard, for the historical perspective.</div><div class="gmail_quote"><br></div><div class="gmail_quote">On Wed, Apr 22, 2015 at 2:35 AM, Richard A. O'Keefe <span dir="ltr">&lt;<a href="mailto:ok@cs.otago.ac.nz" target="_blank">ok@cs.otago.ac.nz</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Erlang syntax is adapted from Prolog syntax.<br>

It is traditional in Prolog parsers to distinguish<br>

between a "." token such as you might find in<br>

a.b.[] (the really old-fashioned way to write a list)<br>

and a ". " token which ends a clause.<br>

And it turns out that erl_scan:string/3 makes exactly<br>

the same distinction:  "a. b" contains a ". " (dot) token<br>

while "a.b" contains a "." ('.') token.  Now a full stop<br>

at the end of a string is also a dot token, but it has<br>

text ".".  </blockquote><div><br></div><div>An input of ".%" also produces a dot token with the text ".". </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">White space as such is never a token.<br></blockquote><div><br></div><div>It is if the scanner receives the 'return_white_spaces' option. This is needed (together with the 'text' option) if one must be able to recreate the original source exactly as it was. </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

This does not look like a bug at all to me.<br></blockquote><div><br></div><div>I agree that it is a stretch to call it a bug, but the behaviour should be better specified. It is unexpected that some whitespace, especially a newline, becomes part of the 'dot' token. I would have kept the whitespace separate and handled the opposite operation as a special case, i.e. adding whitespace after a 'dot' when reconstructing the source (if the token list doesn't already include whitespace tokens).</div><div><br></div><div>Best regards,</div><div>Vlad</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I will say that it would be nice if the<br>

<a href="http://www.erlang.org/doc/man/erl_scan.html" target="_blank">http://www.erlang.org/doc/man/erl_scan.html</a><br>

page contained or linked to an explicit statement<br>

of what the tokens ARE.<br>

<br>

At a minimum, the type category() should be a bit<br>

more explicit than "atom()".<br>

<br>

For that matter,<br>

<a href="http://www.erlang.org/doc/man/erl_parse.html" target="_blank">http://www.erlang.org/doc/man/erl_parse.html</a><br>

should contain or link to an explicit statement<br>

of what the grammar is.<br>

<br>

<br>

</blockquote></div><br></div></div>
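<div dir="ltr"><div>For anyone who wants to check the dot-token behaviour discussed above, here is a minimal sketch. The module name dot_tokens and the function show/0 are made up for illustration. It assumes an OTP release that provides erl_scan:category/1 and erl_scan:text/1 (OTP 18 or later, as far as I know); on older releases erl_scan:token_info/2 would be the place to look instead.</div><div><br></div><pre>
-module(dot_tokens).
-export([show/0]).

%% Scan a few inputs with the 'text' option and print, for every
%% token, its category and the exact source text it covers.
show() -&gt;
    Inputs = ["a. b", "a.b", "a.", "a.%"],
    lists:foreach(fun show/1, Inputs).

%% Per the discussion in this thread, one would expect:
%%   "a. b" : a 'dot' token whose text includes the following space
%%   "a.b"  : a '.' token
%%   "a."   : a 'dot' token with text "."
%%   "a.%"  : a 'dot' token with text "."
show(Input) -&gt;
    {ok, Tokens, _EndLocation} = erl_scan:string(Input, 1, [text]),
    Pairs = [{erl_scan:category(T), erl_scan:text(T)} || T &lt;- Tokens],
    io:format("~p -&gt; ~p~n", [Input, Pairs]).
</pre><div>Compile it and call dot_tokens:show() from the shell. Comparing the printed text of each 'dot' token makes it easy to see which whitespace, if any, was folded into the token.</div></div>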