[erlang-questions] re:run strange behaviour

Vyacheslav Levytskyy v.levytskyy@REDACTED
Thu Oct 24 02:55:12 CEST 2013


Hello,

According to the 're' module documentation, "the quantifiers are "greedy", 
that is, they match as much as possible (up to the maximum number of 
permitted times)". This seems to be the problem in your case. The nested 
quantifiers in your regex, such as (\\w* *)*, force 're' to work through 
a very large number of possible repetitions while backtracking.
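
One way to check whether the nomatch is really a resource limit rather than a genuine failure to match is to pass the 'report_errors' option to re:run/3, which returns runtime errors as {error, Reason} tuples instead of plain nomatch. The sketch below is only an illustration: the module and function names are made up, and I am assuming the internal match limit is what gets hit here, in which case the result may be {error, match_limit}.

```erlang
-module(re_limit_demo).
-export([diagnose/0]).

%% Hypothetical helper: runs the pattern from the original question with
%% report_errors, so runtime failures are returned as {error, Reason}
%% instead of being folded into nomatch.
diagnose() ->
    Subject = <<"foo bar is a foo bar is a big yellow boat or sub">>,
    Pattern = <<"^foo (\\w+(\\w* *)*) is a (\\w+(\\w* *)*)">>,
    re:run(Subject, Pattern,
           [report_errors, global, {capture, [1,3], binary}]).
```

If the call does return an error tuple, the pattern itself is fine and the engine simply gave up, which points at the nested quantifiers rather than at a bug in the matching logic.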

As an alternative, you can use the 'ungreedy' option, which makes 
quantifiers non-greedy by default, so that only those followed by "?" 
are greedy. For example:
re:run(<<"foo bar is a foo bar is a big yellow boat or sub">>,
       <<"^foo (\\w(\\w+| )*) is a (\\w(\\w+?| )*?)">>,
       [ungreedy, global, {capture, [1,3], binary}]).
{match,[[<<"bar">>,
          <<"foo bar is a big yellow boat or sub">>]]}
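
If you want to keep the greedy captures of the original pattern, another option is to remove the nested quantifiers so that each word and each run of spaces can be matched in only one way. This is just a sketch under my own assumptions: the module and function names are hypothetical, words are assumed to be separated by one or more spaces, and the inner groups are non-capturing, so the second phrase becomes capture 2 instead of 3.

```erlang
-module(re_flat_demo).
-export([run/0]).

%% Hypothetical rewrite: "\\w+(?: +\\w+)*" matches a word followed by any
%% number of " word" pieces, with no quantifier nested inside another
%% quantifier that can match the same text in several ways.
run() ->
    Subject = <<"foo bar is a foo bar is a big yellow boat or sub">>,
    Pattern = <<"^foo (\\w+(?: +\\w+)*) is a (\\w+(?: +\\w+)*)">>,
    %% Only two capturing groups remain, hence [1,2].
    re:run(Subject, Pattern, [global, {capture, [1,2], binary}]).
```

With greedy matching this should give the same split as the Clojure result quoted below, that is, "bar is a foo bar" for the first group and "big yellow boat or sub" for the second.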

Best regards,
Vyacheslav Levytskyy

On 23.10.2013 22:26, Alexander Petrovsky wrote:
> Hi!
>
> I have the regex "^foo (\\w+(\\w* *)*) is a (\\w+(\\w* *)*)", and I 
> get strange behaviour when I do:
>
> 1> re:run(<<"foo bar is a foo bar is a big yellow boat or">>,
>           <<"^foo (\\w+(\\w* *)*) is a (\\w+(\\w* *)*)">>,
>           [global, {capture, [1,3], binary}]).
> {match,[[<<"bar is a foo bar">>,<<"big yellow boat or">>]]}
>
> 2> re:run(<<"foo bar is a foo bar is a big yellow boat or sub">>,
>           <<"^foo (\\w+(\\w* *)*) is a (\\w+(\\w* *)*)">>,
>           [global, {capture, [1,3], binary}]).
> nomatch
>
> I tested this regexp in Clojure and Python:
>
> => (re-matches #"foo (\w+(\w* *)*) is a (\w+(\w* *)*)"
>                "foo bar is a foo bar is a big yellow boat or")
> ["foo bar is a foo bar is a big yellow boat or" "bar is a foo bar" "" 
> "big yellow boat or" ""]
>
> => (re-matches #"foo (\w+(\w* *)*) is a (\w+(\w* *)*)"
>                "foo bar is a foo bar is a big yellow boat or sub")
> ["foo bar is a foo bar is a big yellow boat or sub" "bar is a foo bar" 
> "" "big yellow boat or sub" ""]
>
> >>> import re
> >>> p = re.compile(r'foo (\w+(\w* *)*) is a (\w+(\w* *)*)')
> >>> p.match("foo bar is a foo bar is a big yellow boat or")
> <_sre.SRE_Match object at 0x100293c00>
> >>> p.match("foo bar is a foo bar is a big yellow boat or sub")
> <_sre.SRE_Match object at 0x100293ab0>
>
> Can someone explain why I get nomatch on the second string, "foo bar 
> is a foo bar is a big yellow boat or sub"? Is this a bug?
>
>
> -- 
> Alexander Petrovsky,
>
> Skype: askjuise
> Jabber: juise@REDACTED <mailto:juise@REDACTED>
> Phone: +7 914 8 820 815 (irkutsk)
