compile/2 ignores unicode flag

Rory Byrne
Thu Jan 14 23:49:40 CET 2010


Hi,

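I ran into problems using re:split with Unicode input. The problem
appears to be in the re:compile BIF, which re:split calls internally:
a pattern compiled with the unicode option does not seem to keep that
option when it is later passed to re:run.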


Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.7.5  (abort with ^G)    
1>                               
1> % No problem with re:run
1> re:run("h\x{20AC}llo", "(.)", [unicode, global]).
{match,[[{0,1},{0,1}],
        [{1,3},{1,3}],
        [{4,1},{4,1}],
        [{5,1},{5,1}],
        [{6,1},{6,1}]]}
2> 
2> % Problem with compiled version
2> {ok, MP} = re:compile("(.)", [unicode]).
{ok,{re_pattern,1,0,
                <<69,82,67,80,64,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,...>>}}
3> re:run("h\x{20AC}llo", MP, [global]).
** exception error: bad argument
     in function  re:run/3
        called as re:run([104,8364,108,108,111],
                         {re_pattern,1,0,
                                     <<69,82,67,80,64,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,...>>},
                         [global])
4> 
4> % Also causes problems for re:split
4> re:split("h\x{20AC}llo", "(.)", [unicode, {return, list}]).
[[],"h",[],
 {incomplete,[],<<"h">>},
 [],
 {error,[],<<"h">>},
 [],
 {error,[],<<"h">>},
 [],"l",[],"l",[],"o",[]]
5> 


Cheers,

Rory

