[erlang-questions] Regular Expressions Problems

pan+eq@REDACTED pan+eq@REDACTED
Tue Apr 20 12:28:47 CEST 2010


Yes! You are absolutely right. You definitely saved me some time debugging 
this! A huge THANKS!

I don't know why I reset the variable; it seems like a silly thing to 
do now that you've pointed out the error :)

I'll add the patch to Git with you as the author, if that's OK.

/Patrik


On Mon, 19 Apr 2010, Michael Santos wrote:

> On Mon, Apr 19, 2010 at 01:03:13PM +0100, Gordon Guthrie wrote:
>> Folks
>>
>> I think I may have identified a regular expression bug in re.
>>
>> The following code never terminates in R13B-04:
>>
>> -module(fail).
>>
>> -export([fail/0]).
>>
>> fail() ->
>>     Str = "http:/www.flickr.com/slideShow/index.gne?group_id=&user_id=69845378@REDACTED",
>>     EMail_regex = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+"
>>         ++ "(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
>>         ++ "@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+"
>>         ++ "(?:[a-zA-Z]{2}|com|org|net|gov|mil"
>>         ++ "|biz|info|mobi|name|aero|jobs|museum)",
>>     io:format("about to run...~n"),
>>     Ret = re:run(Str, EMail_regex),
>>     io:format("Ret is ~p~n", [Ret]).
>>
>> Eliminate the @ in either the string or the regex and it will
>> terminate, but if you don't, it won't...
>
> $ pcretest
> PCRE version 7.4 2007-09-21
>
>  re> /[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?.)+(?:[a-zA-Z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum)/
>  data> http:/www.flickr.com/slideShow/index.gne?group_id=&user_id=69845378@REDACTED
>  Error -8
>
> Error -8 is PCRE_ERROR_MATCHLIMIT: it is returned when the match() call
> counter reaches a limit (by default, 10000000). The comments in the header
> file explain that "the limit exists in order to catch runaway regular
> expressions that take for ever to determine that they do not match."
>
> In Erlang, after the regexp matching has performed a certain number of
> operations, it is swapped out. When the matching is resumed, the match()
> counter is zeroed, so it never reaches the limit. I'm not sure why this
> is done, but removing the reset at least allows the match to return:
>
> 1> fail:fail().
> about to run...
> Ret is nomatch
> ok
>
>
> diff --git a/erts/emulator/pcre/pcre_exec.c b/erts/emulator/pcre/pcre_exec.c
> index 5162513..3fe13ca 100644
> --- a/erts/emulator/pcre/pcre_exec.c
> +++ b/erts/emulator/pcre/pcre_exec.c
> @@ -5191,7 +5191,6 @@ for(;;)
>       EDEBUGF(("Loop limit break detected"));
>       return PCRE_ERROR_LOOP_LIMIT;
>   RESTART_INTERRUPTED:
> -      md->match_call_count = 0;
>       md->loop_limit = extra_data->loop_limit;
>       rc = match(NULL,NULL,NULL,0,md,0,NULL,0,0);
>       *extra_data->loop_counter_return =
>

