Robert Virding rv@REDACTED
Sun Nov 25 22:16:37 CET 2001

Niels Christensen <christen@REDACTED> writes:
>Reading the documentation for regexp, I am surprised that
>2> regexp:first_match("<DATE>22-03-03</DATE>","<.+>").
>I should have thought (and wanted!) the result to be

To combine and confirm the other replies to this question:

regexp:match will search the whole string to find the longest match.
If there is more than one match with the same length then it will
choose the first one.

regexp:first_match will choose the first match, but it is also greedy
and returns the longest possible match, which is what you discovered.
Originally it just took the first match (as you wanted), but when
someone "optimised" the code this behaviour changed.

I don't know which is better.  At least it is now consistently greedy.

Perhaps you can say that regexp:match returns the "first longest"
while regexp:first_match returns the "longest first".  How about a
"last shortest"? :-)
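
The two selection rules can be sketched in plain Erlang, without any
regexp engine.  This is only a model of the rules as described above,
not the regexp module's actual code: the module name match_choice and
both function names are made up for illustration, and the candidate
matches are supplied by hand as {match,Start,Length} tuples.

```erlang
-module(match_choice).
-export([first_longest/1, longest_first/1]).

%% Hypothetical model: each candidate is {match, Start, Length}.
%% The caller supplies every possible match; nothing is parsed here.

%% "First longest": the greatest Length wins, ties go to the
%% smallest Start.
first_longest([]) ->
    nomatch;
first_longest(Candidates) ->
    hd(lists:sort(fun({match, S1, Len1}, {match, S2, Len2}) ->
                          {-Len1, S1} =< {-Len2, S2}
                  end, Candidates)).

%% "Longest first": the smallest Start wins, ties go to the
%% greatest Length.
longest_first([]) ->
    nomatch;
longest_first(Candidates) ->
    hd(lists:sort(fun({match, S1, Len1}, {match, S2, Len2}) ->
                          {S1, -Len1} =< {S2, -Len2}
                  end, Candidates)).
```

Take the pattern "a+" and the string "ab aaab".  The possible matches
are {match,1,1}, {match,4,1}, {match,4,2}, {match,4,3}, {match,5,1},
{match,5,2} and {match,6,1}.  Given that list, first_longest/1 picks
{match,4,3} and longest_first/1 picks {match,1,1}.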

Yes, "." is consistent with other regular expression implementations
and matches any character except \n.

