[erlang-questions] Fast regular expression implementation
Thu Dec 21 01:36:31 CET 2006
Have you run any benchmarks comparing your implementation to the OTP
regexp and/or the revised one on trapexit? Also, can you please give us
a hint as to what makes your implementation faster?
On 12/18/06, Gaspar Chilingarov <nm@REDACTED> wrote:
> Hi all!
> I wish to announce an implementation of regular expressions in Erlang
> that is fast enough to be useful for text processing and extraction.
> Please follow the link for download: http://zanazan.am/erlang/re.html
> Some things are not implemented yet (e.g. the alternation operator "|"
> between regexp branches).
> Subpatterns are extracted using (); grouping without extraction is done
> as in Perl: (?:pattern). Multiple nested subpatterns are allowed.
> I've tried to keep the behavior as close as possible to Perl patterns.
> All substitute functions are missing at the moment -- I will be glad to
> get suggestions on what should be implemented besides the standard sub/gsub.
> The library is quite fast: an 18 KB text is matched against the pattern
> "class=g.*?<a\s+class=l\s+href=\"(.*?)\">(.*?)</a>" to extract
> all matches in 10-12 ms (if you ask only for positions). If you ask only
> for subpattern matches (i.e. re:mgg), it takes only 18 ms.
> The same text concatenated 100 times (1.8 MB) is matched in the first
> case in 1.2 sec, and with subpattern text extraction in about 2.5 sec, so
> matching time grows linearly. With the gregexp implementation, time
> grows exponentially.
> I would like to hear any feedback, especially bug reports.
> Gaspar Chilingarov
> System Administrator,
> Network security consulting
> t +37493 419763 (mob)
> i 63174784
> e nm@REDACTED
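
For readers unfamiliar with the Perl-style features mentioned above
(capturing (), non-capturing (?:pattern), nested subpatterns, and the
non-greedy .*?), here is a minimal sketch. It uses Python's standard re
module, not the library announced in this thread, whose API is not shown
here. The sample text, the function names, and the use of re.DOTALL are
assumptions made for illustration only.

```python
import re

# The pattern quoted in the post. re.DOTALL is an assumption so that "."
# also crosses line breaks in multi-line HTML.
PATTERN = re.compile(
    r'class=g.*?<a\s+class=l\s+href="(.*?)">(.*?)</a>', re.DOTALL
)

# Hypothetical sample input, made up for this example.
SAMPLE = (
    '<p class=g><a class=l href="http://example.org/a">First</a></p>\n'
    '<p class=g><a  class=l\nhref="http://example.org/b">Second</a></p>\n'
)


def match_positions(text):
    """Return the (start, end) offsets of every whole match."""
    return [m.span() for m in PATTERN.finditer(text)]


def match_groups(text):
    """Return the text captured by each () subpattern, one tuple per match."""
    return [m.groups() for m in PATTERN.finditer(text)]


# (?:...) groups without capturing, so only the nested () pairs are
# reported: the outer pair first, then the two inner pairs.
NESTED = re.compile(r'(?:\w+)=((\w+)-(\d+))')


if __name__ == "__main__":
    print(match_positions(SAMPLE))
    print(match_groups(SAMPLE))
    print(NESTED.findall("key=abc-12 id=xyz-7"))
```

Asking only for positions corresponds to the first timing case in the
post, and asking for the captured text corresponds to the second. The
sketch says nothing about the relative speed of any Erlang
implementation.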