[erlang-bugs] leex extended regexp {m,n} quantifiers

Suraj N. Kurapati sunaku@REDACTED
Wed Oct 21 23:48:44 CEST 2015


Hello,

I'm using Erlang/OTP 18.1 ([erts-7.1] [source] [64-bit] [smp:24:24]
[async-threads:10] [hipe] [kernel-poll:false]), and I find that leex
does not honor the extended regexp numerical repetition quantifiers
{n} and {m,n} [1].

For example, I'm trying to recognize exactly 11, 10, or 7 uppercase
alphanumeric characters as a product_id in a leex rule, as follows:

[A-Z0-9]{11}|[A-Z0-9]{10}|[A-Z0-9]{7}
           : {token, {product_id, TokenLine, TokenChars}}.

That did not work, so I tried escaping the braces:

[A-Z0-9]\{11\}|[A-Z0-9]\{10\}|[A-Z0-9]\{7\}
           : {token, {product_id, TokenLine, TokenChars}}.

That did not work either, so I had to expand the repetitions by hand:

[A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9]|[A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9]|[A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9]
           : {token, {product_id, TokenLine, TokenChars}}.
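The expanded rule can be shortened somewhat with a leex macro. The
following is a minimal, untested sketch: it assumes the macro
substitution described in the leex documentation (NAME = VALUE under
"Definitions.", referenced as {NAME} in a rule), and the macro name C
is a hypothetical choice, not something from the original rule.

```erlang
Definitions.

% C is a hypothetical macro name for one product_id character.
C = [A-Z0-9]

Rules.

% 11, 10, or 7 repetitions of {C}, written out because {m,n} is not honored.
{C}{C}{C}{C}{C}{C}{C}{C}{C}{C}{C}|{C}{C}{C}{C}{C}{C}{C}{C}{C}{C}|{C}{C}{C}{C}{C}{C}{C}
           : {token, {product_id, TokenLine, TokenChars}}.

Erlang code.
```

This only shortens each repetition; the counts still have to be
written out by hand, so it does not replace proper quantifier support.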

This is a giant step backwards in terms of regexp readability and
maintainability, so please make leex honor the extended regexp
numerical repetition quantifiers [1] since AWK supports them [2].

Furthermore, it seems that work on this had already been started [3]
but, for reasons unknown to me, was commented out in the code [4].

Thanks for your consideration.

[1] http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_06
[2] http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_04
[3] https://github.com/erlang/otp/blob/maint/lib/parsetools/src/leex.erl#L138
[4] https://github.com/erlang/otp/blob/maint/lib/parsetools/src/leex.erl#L692-L700