[erlang-questions] Regexp Matching on Unicode

José Valim jose.valim@REDACTED
Tue Dec 13 12:14:00 CET 2016


Apologies, only after Hugo Mills' reply did I notice that your question was
about leex and not re.

leex does not support Unicode character classes such as \p{...} or \w. It
does accept Unicode input, and you can use Unicode characters as literals in
your rules, such as [á-ú], the pound sign, etc.
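
As a rough sketch of that workaround, a leex file can spell out letter ranges
explicitly instead of relying on \p{L}. The file name letters.xrl, the macro
names and the token names below are made up for illustration, and the sketch
assumes the .xrl file is saved as UTF-8. The ranges only cover ASCII letters
plus the [á-ú] range mentioned above, so extend them for the scripts you need.

```erlang
%% letters.xrl (hypothetical file name)
%% Sketch only: "letter" here means the explicit ranges listed in LETTER,
%% not the full Unicode letter property.

Definitions.

%% A range covers every code point between its endpoints, so á-ú
%% (U+00E1..U+00FA) also includes the division sign U+00F7.
LETTER = [a-zA-Zá-ú]
DIGIT  = [0-9]
WS     = [\s\t\n\r]

Rules.

%% Letters and digits become separate tokens; whitespace is dropped.
{LETTER}+ : {token, {word, TokenLine, TokenChars}}.
{DIGIT}+  : {token, {number, TokenLine, TokenChars}}.
{WS}+     : skip_token.

Erlang code.
```

To try it, generate the scanner with leex:file("letters.xrl"), compile the
resulting letters.erl, and pass the input as a list of code points rather
than a UTF-8 binary, for example
letters:string(unicode:characters_to_list(<<"olá 42"/utf8>>)).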



*José Valim*
www.plataformatec.com.br
Skype: jv.ptec
Founder and Director of R&D

On Tue, Dec 13, 2016 at 11:40 AM, José Valim <
jose.valim@REDACTED> wrote:

> Make sure to escape the backslash of the property escape (\\p inside an
> Erlang string) and to pass the [unicode] option when compiling, and it
> should be good to go:
>
> 28> {ok, Reg} = re:compile("\\p{L}{5}", []).
> {ok,{re_pattern,0,0,0,
>                 <<69,82,67,80,77,0,0,0,0,0,0,0,1,0,0,0,255,255,255,255,
>                   255,255,...>>}}
> 29> re:run(<<"こんにちは"/utf8>>, Reg).
> nomatch
>
> 30> {ok, RegUni} = re:compile("\\p{L}{5}", [unicode]).
> {ok,{re_pattern,0,1,0,
>                 <<69,82,67,80,77,0,0,0,0,8,0,0,1,0,0,0,255,255,255,255,
>                   255,255,...>>}}
> 31> re:run(<<"こんにちは"/utf8>>, RegUni).
> {match,[{0,15}]}
>
>
> On Tue, Dec 13, 2016 at 11:32 AM, Zachary Kessin <zkessin@REDACTED>
> wrote:
>
>> Hi All
>>
>> I am hitting a bit of a wall here. I am building a lexer with leex and I
>> really want to match on Unicode characters. There is a regex class
>> \p{Letter}, but that does not seem to work in Erlang. What I really want
>> is a way to say "Match a letter, but not a digit", so \w would not work.
>> Any ideas?
>>
>> --
>> Zach Kessin
>> SquareTarget <http://squaretarget.rocks?utm_source=email-sig>
>> Twitter: @zkessin <https://twitter.com/zkessin>
>> Skype: zachkessin
>>
>