[erlang-questions] regular expression - exponential complexity

Tuncer Ayaz tuncer.ayaz@REDACTED
Wed May 19 22:38:04 CEST 2010


On Wed, May 19, 2010 at 7:48 PM, Ivan Glushkov wrote:
> Hi.
>
> I'm almost sure that you know this, so JFYI.
> I found regexp that has exponential complexity while in perl
> the same regexp seems to grow linearily.
>
> So the question is, are you going to move to re2 or any other
> DFA regexp library?

[snip]

> Some results (f and fc give the same result).
>
> string length / frequency (run cycles per second)
> 3       50000
> 5       30000
> 10      1455
> 11      1136
> 12      586
> 13      292
> 14      146
> 15      73
> 16      37

Have you tried the same within a vanilla pcretest shell
invoked as 'pcretest -dfa'? OTP's PCRE is patched for
restartability/interruptability reasons and also compiled with
the NO_RECURSE/--disable-stack-for-recursion option. The PCRE
docs say this option has a speed impact on pcre_exec() but not
on pcre_dfa_exec(). As re seems to strive for Perl compatibility
erts_pcre makes use of erts_pcre_exec() only. The erts_ prefix is
presumably used to reflect the patched nature of the bundled PCRE. If
you're curious you might write and test a linked-in driver/nif with
stock PCRE :). Actually the comments in pcre_dfa_exec() say that it's
using a 'sort of DFA algo' and not 'a true FSM'. Let's document that
info bit here before anyone asks.

See the pcrestack(3) and pcretest(1) manpages for details.

As you have mentioned it:
If the feature set of RE2 suits your needs have a look at
http://github.com/tuncer/re2.


More information about the erlang-questions mailing list