[erlang-questions] unicode chardata and re:run
Johannes Weißl
jargon@REDACTED
Wed May 2 15:09:37 CEST 2012
Hello List!
I have a question regarding the regular expression module:
The man page [1] says that re:run/3 accepts a unicode:charlist() [2] as
Subject/RE when the "unicode" option is supplied. However, calling the
function with a UTF-8 binary also works:
match = re:run(<<"foo">>, <<"f.o">>, [{capture, none}, unicode]).
I even found a test case which relies on re:run/3 accepting
unicode:chardata() (charlist + unicode_binary) in re_SUITE.erl [3].
Does this mean I can rely on re:run/3 accepting binaries (in which case
the documentation should be changed), or does re:run/3 only accept
charlists (in which case the test case needs to be changed)?
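Until that is settled, one way to stay within the documented contract is
to convert any chardata to a charlist before calling re:run/3. The
following is only a minimal sketch of that idea; the module name
re_chardata_sketch and the helper matches/2 are hypothetical names made
up for illustration and are not part of OTP:

```erlang
-module(re_chardata_sketch).
-export([matches/2, demo/0]).

%% Hypothetical helper (not part of OTP): normalise Subject and Pattern
%% to charlists so that re:run/3 only ever sees the documented type.
matches(Subject, Pattern) ->
    case {unicode:characters_to_list(Subject),
          unicode:characters_to_list(Pattern)} of
        {S, P} when is_list(S), is_list(P) ->
            re:run(S, P, [{capture, none}, unicode]);
        _ ->
            %% characters_to_list/1 returned an error or incomplete tuple
            {error, invalid_chardata}
    end.

demo() ->
    match   = matches(<<"foo">>, <<"f.o">>),          % UTF-8 binaries
    match   = matches("foo", "f.o"),                  % plain charlists
    match   = matches(["f", <<"oo">>], "f.o"),        % mixed chardata
    match   = matches([87,101,105,223,108], "Wei.l"), % code point > 127
    nomatch = matches(<<"bar">>, "f.o"),
    ok.
```

Calling re_chardata_sketch:demo() should return ok. The conversion costs
an extra pass over the data, so it is only worthwhile if relying on the
undocumented binary behaviour is a concern.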
I found a post from 2010 [4] in which the first option is suggested.
[1] http://www.erlang.org/doc/man/re.html#run-3
[2] http://www.erlang.org/doc/man/unicode.html#type-charlist
[3] https://github.com/erlang/otp/blob/master/lib/stdlib/test/re_SUITE.erl#L295
[4] http://erlang.org/pipermail/erlang-patches/2010-January/000697.html
Greetings,
Johannes