Regular Expressions Problems
Gordon Guthrie
gordon@REDACTED
Mon Apr 19 14:03:13 CEST 2010
Folks
I think I may have identified a regular expression bug in re.
The following code never terminates in R13B-04:
-module(fail).
-export([fail/0]).

fail() ->
    Str = "http:/www.flickr.com/slideShow/index.gne?group_id=&user_id=69845378@REDACTED",
    EMail_regex = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+"
        ++ "(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
        ++ "@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+"
        ++ "(?:[a-zA-Z]{2}|com|org|net|gov|mil"
        ++ "|biz|info|mobi|name|aero|jobs|museum)",
    io:format("about to run...~n"),
    Ret = re:run(Str, EMail_regex),
    io:format("Ret is ~p~n", [Ret]).
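To watch this without hanging the shell, the call can be wrapped in a separate process with a timeout. This is only a sketch: the module name fail_timeout and the function run/3 are hypothetical and not part of the report above, and it assumes re:run/2 yields to the scheduler so that the worker process can actually be killed.

```erlang
-module(fail_timeout).
-export([run/3]).

%% Hypothetical helper, not from the original report: runs re:run/2 in a
%% separate process and gives up after Timeout milliseconds.
run(Str, Regex, Timeout) ->
    Parent = self(),
    Ref = make_ref(),
    Pid = spawn(fun() -> Parent ! {Ref, re:run(Str, Regex)} end),
    receive
        {Ref, Result} ->
            {ok, Result}
    after Timeout ->
        %% Assumes the worker can be interrupted while matching.
        exit(Pid, kill),
        timeout
    end.
```

Calling fail_timeout:run(Str, EMail_regex, 5000) with the Str and EMail_regex from fail/0 should then give either {ok, Ret} or the atom timeout instead of blocking forever.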
Eliminate the @ in either the string or the regex and it will
terminate, but if you don't, it won't...
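One thing that may be worth ruling out (an assumption, nothing above confirms it is related): inside an Erlang string literal, "\." is itself an escape sequence that yields a plain ".", so the regex engine may be seeing an unescaped dot rather than a literal one. The sketch below, with a hypothetical module name, only compares the string literals and does not touch re at all.

```erlang
-module(escape_check).
-export([show/0]).

%% Hypothetical check, not from the original report.
show() ->
    %% "\." and "." are the same one-character string in Erlang source.
    Same = ("\." =:= "."),
    %% "\\." keeps the backslash, so a regex engine would see \. (literal dot).
    Escaped = "\\.",
    {Same, length(Escaped), Escaped =:= [$\\, $.]}.
```

If the first element comes back true, the "\." fragments in EMail_regex reach re:run/2 as "." and would need to be written "\\." to mean a literal dot.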
There is a comment about the behaviour of '@' in Perl regular
expressions in the docos:
> If you want to remove the special meaning from a sequence of characters, you can do so by putting them between \Q and \E.
> This is different from Perl in that $ and @ are handled as literals in \Q...\E sequences in PCRE, whereas in Perl, $ and @ cause variable interpolation.
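As a quick check of what that passage describes, here is a small sketch comparing a quoted and a bare @ in re. The module name quote_check and the sample subject string are made up for illustration, and the pattern is written with doubled backslashes so that \Q and \E survive the Erlang string literal.

```erlang
-module(quote_check).
-export([show/0]).

%% Hypothetical check, not from the original report.
show() ->
    Subject = "user@example",
    %% "\\Q@\\E" reaches the regex engine as \Q@\E, a quoted literal @.
    Quoted = re:run(Subject, "\\Q@\\E"),
    %% A bare @ is also just a literal character in PCRE.
    Bare = re:run(Subject, "@"),
    {Quoted, Bare}.
```

If both elements of the tuple are the same match, the @ is being treated as an ordinary literal either way, and quoting it should make no difference to the hang.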
Jamie Zawinski's famous comment springs to mind:
> Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Gordon