On 27/09/2007, <b class="gmail_sendername">G Bulmer</b> <<a href="mailto:gbulmer@gmail.com">gbulmer@gmail.com</a>> wrote:<div><span class="gmail_quote"></span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I guess it's the 'pathological' cases that worry me; "MY software<br>NEVER breaks on the easy cases" :-)<br>Seriously though, having a regexp with stable behaviour wins over one<br>that gets the last 200% of performance *most of the time*, but is
<br>unstable.</blockquote><div><br>I did some tests using Russ Cox's example, comparing the old regexp module (not a good comparison, I know) with a new version I am working on. The results confirmed his findings.<br><br><pre>N         15     18     20     22     25     30     40
regexp    40    360   1570   6900  59000
re       0.2   0.25   0.32   0.38    0.5   0.73    1.4</pre>
<br>All times are in milliseconds. The old regexp uses a backtracking algorithm.
<br> </div>This suggests that we will have no problems with pathological cases. :-) If people feel that the new module is too slow for the simpler regexps and would prefer to use a C library, then it is definitely important to choose the *right* library.
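<br><br>For anyone who wants to see the shape of such a test, here is a minimal sketch in Python. It is not the Erlang code behind the timings above. It assumes the example in question is the one from Russ Cox's article, where the pattern a?<sup>N</sup>a<sup>N</sup> is matched against a string of N a's. The helper names pathological and time_match are made up for illustration. Python's re module is a backtracking matcher, so the times should grow roughly exponentially with N, much like the old regexp row.

```python
import re
import time


def pathological(n):
    """Build the (pattern, subject) pair: 'a?' * n + 'a' * n against 'a' * n."""
    return "a?" * n + "a" * n, "a" * n


def time_match(n):
    """Return (matched, elapsed milliseconds) for one match of size n."""
    pattern, subject = pathological(n)
    compiled = re.compile(pattern)  # compile outside the timed region
    start = time.perf_counter()
    matched = compiled.fullmatch(subject) is not None
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return matched, elapsed_ms


if __name__ == "__main__":
    # Keep N small: a backtracking matcher roughly doubles its work per step.
    for n in (5, 10, 15, 18):
        matched, ms = time_match(n)
        print(f"N={n:2d} matched={matched} {ms:10.3f} ms")
```

Absolute numbers will differ from the table, since the machine, language and implementation are all different. Only the growth pattern is comparable.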
<br><br>Robert<br></div><br>