[erlang-questions] Regexp Matching on Unicode

Hugo Mills
Tue Dec 13 11:41:50 CET 2016


On Tue, Dec 13, 2016 at 12:32:43PM +0200, Zachary Kessin wrote:
> Hi All
> 
> I am hitting a bit of a wall here. I am building a lexer with leex and I
> really want to match on Unicode characters. There is a regex class \p{Letter},
> but that does not seem to work in Erlang. What I really want is a way to say
> "Match a letter, but not a digit", so \w would not work. Any ideas?

   I think if you want Unicode support, you need to write your own
lexer, or use something other than leex, which is a bit limited in what
it supports. I went through this earlier this year and ended up writing
my own, partly for that reason and partly because of the way I
wanted to process block comments.
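
   As a rough illustration of the "write your own" route, here is a
minimal sketch, not the lexer I actually wrote. It assumes the input is
a valid UTF-8 binary and leans on the standard re module, which accepts
Unicode property classes such as \p{L} (any letter) and \p{Nd} (decimal
digit) when given the unicode option. The module name uni_lex, the
token tags, and the rule set are all made up for the example.

```erlang
-module(uni_lex).
-export([tokens/1]).

%% Hypothetical hand-written tokenizer (illustration only, not leex).
%% Input must be a valid UTF-8 binary; re:run/3 raises badarg otherwise.

%% Rules are tried in order. Every pattern uses '+', so a match always
%% consumes at least one character and the scan loop cannot stall.
rules() ->
    [{word,   "\\p{L}+"},    % one or more Unicode letters, no digits
     {number, "\\p{Nd}+"},   % one or more Unicode decimal digits
     {skip,   "\\s+"}].      % whitespace, dropped from the output

tokens(Bin) when is_binary(Bin) ->
    tokens(Bin, []).

tokens(<<>>, Acc) ->
    lists:reverse(Acc);
tokens(Bin, Acc) ->
    case first_match(Bin, rules()) of
        {skip, Rest}      -> tokens(Rest, Acc);
        {Tag, Text, Rest} -> tokens(Rest, [{Tag, Text} | Acc]);
        nomatch           -> {error, {illegal, Bin}}
    end.

%% Try each rule at the start of Bin (the 'anchored' option) and return
%% the first one that matches, together with the remaining input.
first_match(_Bin, []) ->
    nomatch;
first_match(Bin, [{Tag, Pattern} | Rules]) ->
    Opts = [unicode, anchored, {capture, first, binary}],
    case re:run(Bin, Pattern, Opts) of
        {match, [Text]} ->
            Size = byte_size(Text),
            <<_:Size/binary, Rest/binary>> = Bin,
            case Tag of
                skip -> {skip, Rest};
                _    -> {Tag, Text, Rest}
            end;
        nomatch ->
            first_match(Bin, Rules)
    end.
```

   With this, uni_lex:tokens(<<"abc1">>) should split into a word
token for "abc" followed by a number token for "1", which is the
"letter but not digit" behaviour that \w cannot give you. A real lexer
would also want line tracking and precompiled patterns (re:compile/2).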

   Hugo.

-- 
Hugo Mills             | "There's more than one way to do it" is not a
 carfax.org.uk | commandment. It is a dire warning.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |