[erlang-questions] Atom Unicode Support

Fred Hebert mononcqc@REDACTED
Wed Feb 3 15:11:01 CET 2016


On 02/03, Pierre Fenoll wrote:
>What about re.erl character classes?
>
>I believe the regular expression [\s] does not match Unicode spaces, even when giving the unicode atom flag to re.erl functions.
>
>And there are other classes that Unicode defines that would be great for re.erl to support.

Pass in the `ucp' option along with `unicode'. From the re documentation:

ucp
    Specifies that Unicode Character Properties should be used when 
    resolving \B, \b, \D, \d, \S, \s, \W and \w. Without this flag, only 
    ISO-Latin-1 properties are used. Using Unicode properties hurts 
    performance, but is semantically correct when working with Unicode 
    characters beyond the ISO-Latin-1 range.
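
As a quick illustration, here is a minimal sketch of the difference. The module and function names (ucp_demo, has_space/1, demo/0) are made up for this example; re:run/3 and the `unicode', `ucp' and `{capture, none}' options come from the re module. The subject contains U+2003 (EM SPACE), chosen on the assumption that a space outside ISO-Latin-1 is the case you care about.

```erlang
-module(ucp_demo).
-export([has_space/1, demo/0]).

%% Hypothetical helper: returns true if "\s" matches anywhere in a
%% fixed subject containing U+2003 (EM SPACE), using the given options.
has_space(Opts) ->
    %% "a", then EM SPACE encoded as UTF-8, then "b"
    Subject = <<"a", 16#2003/utf8, "b">>,
    %% {capture, none} makes re:run/3 return just match | nomatch
    match =:= re:run(Subject, "\\s", [{capture, none} | Opts]).

%% Compare the result with and without the ucp option.
demo() ->
    {has_space([unicode]), has_space([unicode, ucp])}.
```

I would expect ucp_demo:demo() to return {false,true}: with only `unicode' the \s class does not cover U+2003, and adding `ucp' makes it match.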


