[erlang-questions] Adoption of perl/javascript-style regexp syntax
Ulf Wiger
ulf.wiger@REDACTED
Tue Jun 2 15:02:35 CEST 2009
Geoffrey Biggs wrote:
> Python provides a method of specifying strings they call "raw strings,"
> which I find quite interesting. Basically, you prefix your string with r
> or R, and any backslashes are treated as literal characters rather than
> escape sequences. For example:
>
> >>> '\b'
> '\x08'
> >>> r'\b'
> '\\b'
The problem is that r'\b' is made up of regular tokens and
already scans successfully today:
4> erl_scan:string("r'\b'.").
{ok,[{atom,1,r},{atom,1,'\b'},{dot,1}],1}
To support it, one would have to make r' a token in its own
right, which *might* break existing code (however unlikely
that is) - or complicate the scanner by having it look ahead,
in a form of quick parse, to figure out whether this is a
string or not.
That was one reason why I went for the backtick. It's not
recognized by the parser today.
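Another problem, of course, is that while the r'...' syntax
lets you write \ without escaping, it still has some escaping
quirks, which I find a bit unintuitive: a backslash in front
of the quote character still keeps the quote from ending the
string (and stays in the result), so a raw string cannot end
in a single backslash.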
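To make the raw-string quirk concrete, here is a small Python sketch. The helper name compiles is made up for illustration; it only asks the Python compiler whether a piece of source text is a valid expression.

```python
def compiles(source):
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# r'\'' is valid: the backslash stops the quote from closing the string,
# but the backslash itself stays in the value (two characters: \ and ').
escaped_quote = r'\''

# The source text r'\' is rejected, because the backslash "protects" the
# closing quote and the literal never terminates.
single_backslash_ok = compiles("r'\\'")

# The source text r'\\' is accepted and yields two backslashes.
double_backslash_ok = compiles("r'\\\\'")
```

So the raw string drops escape processing, but the scanner still treats backslash-quote specially when looking for the end of the literal.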
By contrast, the `P...P form is pretty simple to understand:
you just have to pick a delimiter that doesn't show up in the
string. It could be `'foo', `&foo&, or whatever. The way I
wrote it, you can't pick \ or \n as the delimiter, although
\ would actually work, I guess... (A newline would work too,
but I find that unintuitive.)
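As a rough illustration of the `P...P rule, here is a minimal Python sketch. It is not the actual scanner patch; the function name scan_backtick_literal and its return shape are assumptions made for this example. The character right after the backtick is taken as the delimiter, and everything up to the next occurrence of that character is the string, with no escape processing at all.

```python
def scan_backtick_literal(text, start=0):
    """Scan a `P...P literal beginning at text[start].

    Hypothetical helper: returns (body, next_index), where body is the
    raw contents and next_index is the position after the closing
    delimiter.
    """
    if text[start:start + 1] != "`":
        raise ValueError("expected a backtick")
    if start + 1 >= len(text):
        raise ValueError("missing delimiter after backtick")

    delim = text[start + 1]
    # Mirrors the restriction described above: backslash and newline
    # are not accepted as delimiters in this sketch.
    if delim in ("\\", "\n"):
        raise ValueError("backslash and newline are not allowed as delimiters")

    end = text.find(delim, start + 2)
    if end == -1:
        raise ValueError("unterminated literal")

    # The body is taken verbatim; backslashes are ordinary characters.
    return text[start + 2:end], end + 1
```

For example, scanning `'foo' gives the body foo, and scanning `&a\b& gives the three characters a, \ and b. Because the delimiter is chosen per literal, nothing inside the body ever needs escaping.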
BR,
Ulf W
--
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com