[erlang-questions] xmerl_regexp "\w" character class

Tim Watson watson.timothy@REDACTED
Tue Jan 17 08:41:25 CET 2012


xmerl_regexp does appear to have a completely custom regex engine, so the
missing \w could be an oversight. According to the code (as of R14B), you
*can* use POSIX character classes like [:alnum:] and [:alpha:], though.
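
A rough, untested sketch of what that might look like, for pasting into an
Erlang shell. It assumes the classes have to be written inside a bracket
expression, as in the usual POSIX syntax, and that xmerl is on the code path.
The variable name MatchWord is only illustrative.

```erlang
%% Assumption: xmerl_regexp accepts the bracketed POSIX form [[:alnum:]],
%% as the R14B source suggests. The underscore is added by hand because
%% \w normally also covers "_".
{ok, MatchWord} = xmerl_regexp:setup("[[:alnum:]_]").

%% Same call as in the transcript below, now using the POSIX-class pattern.
xmerl_regexp:match("a", MatchWord).
```

If that works, the schema restriction could presumably be written as
<xsd:pattern value="[[:alnum:]_]{1,4}"/>. That would be specific to xmerl,
though: POSIX classes are not part of the XSD regex syntax, so other
validators are unlikely to accept it.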

2012/1/16 Ignas Vyšniauskas <baliulia@REDACTED>

> Hi Erlangers,
>
> why does xmerl_regexp not support the character class \w?
>
> 1> {ok, MatchW} = xmerl_regexp:setup("\\w").
> {ok,{comp_regexp,{{{c_state,1,none,none,none,none,none,
> <..>}}}
> 2> xmerl_regexp:match("a", MatchW).
> nomatch
> 3> {ok, MatchW2} = xmerl_regexp:setup("[a-z]").
> {ok,{comp_regexp,{{{c_state,1,none,none,none,none,none,
> <..>}}}
> 4> xmerl_regexp:match("a", MatchW2).
> {match,1,1}
>
> It supports \d for example:
>
> 5> {ok, MatchD} = xmerl_regexp:setup("\\d").
> {ok,{comp_regexp,{{{c_state,1,none,none,none,none,none,
> <..>}}}
> 6> xmerl_regexp:match("1", MatchD).
> {match,1,1}
>
> This causes XML validation to fail when I have the restriction in the
> schema <xsd:pattern value="\w{1,4}"/>
> I get: {error, [{pattern_mismatch,"WORD","\\w{1,4}"}]}.
>
> Of course I can change the restriction to
> <xsd:pattern value="[A-Za-z0-9_]{1,4}"/>
> but I wanted to make people aware of this issue.
>
> --
> Ignas