[erlang-questions] xmerl_regexp "\w" character class

Ignas Vyšniauskas
Mon Jan 16 23:31:39 CET 2012


Hi Erlangers,

Why does xmerl_regexp not support the character class \w?

1> {ok, MatchW} = xmerl_regexp:setup("\\w").
{ok,{comp_regexp,{{{c_state,1,none,none,none,none,none,
<..>}}}
2> xmerl_regexp:match("a", MatchW).
nomatch
3> {ok, MatchW2} = xmerl_regexp:setup("[a-z]").
{ok,{comp_regexp,{{{c_state,1,none,none,none,none,none,
<..>}}}
4> xmerl_regexp:match("a", MatchW2).
{match,1,1}

It does support \d, for example:

5> {ok, MatchD} = xmerl_regexp:setup("\\d").
{ok,{comp_regexp,{{{c_state,1,none,none,none,none,none,
<..>}}}
6> xmerl_regexp:match("1", MatchD).
{match,1,1}

This causes XML validation to fail when the schema contains the restriction
<xsd:pattern value="\w{1,4}"/>
For the value "WORD" I get: {error, [{pattern_mismatch,"WORD","\\w{1,4}"}]}.

Of course I can change the restriction to
<xsd:pattern value="[A-Za-z0-9_]{1,4}"/>
but I wanted to make people aware of this issue.
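
For anyone who wants to sanity-check the workaround outside xmerl, here is a
minimal sketch using OTP's re module (PCRE based, not xmerl_regexp). The
module name word_class_check and the helpers matches/2 and demo/0 are made up
for illustration. The pattern is anchored because an xsd:pattern has to match
the whole value. Note that this only compares the two patterns on ASCII
input: re treats \w as the ASCII word class by default, while the XSD
specification defines \w in terms of Unicode categories, so neither pattern
is an exact stand-in for a conforming XSD engine.

```erlang
-module(word_class_check).
-export([matches/2, demo/0]).

%% Hypothetical helper: true if Pattern matches the whole of Value.
%% The pattern is wrapped in ^(?:...)$ to mimic the whole-value
%% matching that xsd:pattern requires.
matches(Value, Pattern) ->
    Anchored = "^(?:" ++ Pattern ++ ")$",
    case re:run(Value, Anchored, [{capture, none}]) of
        match   -> true;
        nomatch -> false
    end.

%% Runs both patterns over a few ASCII sample values and returns
%% {Value, [ResultForW, ResultForBracketClass]} for each one.
demo() ->
    Patterns = ["\\w{1,4}", "[A-Za-z0-9_]{1,4}"],
    Values   = ["WORD", "a_1", "WORDS", ""],
    [{V, [matches(V, P) || P <- Patterns]} || V <- Values].
```

After compiling the module, word_class_check:demo() should give the same
answer for both patterns on each sample value, which is the property the
workaround relies on for ASCII-only data.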

--
Ignas


