[erlang-questions] regexp sux!

Darius Bacon darius@REDACTED
Wed May 30 08:22:37 CEST 2007


"Christian S" <chsu79@REDACTED> wrote:
> Would it be easy to make regexps incremental, so that I can
> feed a regexp a string and then find out whether the string contained
> enough for the regexp to match/halt, or whether I need to continue
> feeding it more data?
> 
> Imagine I am receiving data blocks with an active gen_tcp socket, and
> my potential regexp match is across the boundary of two received data
> blocks.
> 
> I'm imagining an interface somewhat similar to md5_init/md5_update but
> with an exit to let me know there was a match, at which point I would
> get the matched part and the remaining part returned to me so I can
> continue matching on the latter once I have processed the matched
> block.

I don't think the tree-recursive matchers do that naturally, but what
you want does correspond directly to DFA states or regular-expression
derivatives. I just hacked up an implementation of derivatives and
bundled it with my earlier regexp code at
http://jungerl.sourceforge.net/ in the library named 'ergex'.
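
To make the idea concrete, here is a rough sketch of derivative-based
incremental matching. It is written in Python purely for illustration and
is not the ergex code or its API; every name in it (NULL, EPS, char, seq,
alt, star, nullable, deriv, feed) is made up for this example. The
derivative of a regexp with respect to a character is the regexp that
matches whatever may follow that character, so the "state" you carry
between data blocks is simply the current regexp. The sketch assumes an
anchored match that halts at the first (shortest) match.

```python
# Regexps are plain tuples. All names here are hypothetical, not ergex's API.
NULL = ("null",)   # matches no string at all
EPS = ("eps",)     # matches only the empty string

def char(c):
    return ("chr", c)

def seq(a, b):
    # Smart constructor: simplify so derivatives stay small.
    if a == NULL or b == NULL:
        return NULL
    if a == EPS:
        return b
    if b == EPS:
        return a
    return ("seq", a, b)

def alt(a, b):
    if a == NULL:
        return b
    if b == NULL:
        return a
    if a == b:
        return a
    return ("alt", a, b)

def star(a):
    if a in (NULL, EPS):
        return EPS
    if a[0] == "star":
        return a
    return ("star", a)

def nullable(r):
    """True if r matches the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("null", "chr"):
        return False
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])  # alt

def deriv(r, c):
    """The regexp matching what may follow c in strings matched by r."""
    tag = r[0]
    if tag in ("null", "eps"):
        return NULL
    if tag == "chr":
        return EPS if r[1] == c else NULL
    if tag == "alt":
        return alt(deriv(r[1], c), deriv(r[2], c))
    if tag == "star":
        return seq(deriv(r[1], c), r)
    # seq: c is consumed by the left part, or by the right part
    # when the left part can match the empty string.
    left = seq(deriv(r[1], c), r[2])
    return alt(left, deriv(r[2], c)) if nullable(r[1]) else left

def feed(state, chunk):
    """Advance state over chunk.

    Returns ("match", rest_of_chunk), ("fail", None), or
    ("more", new_state) when the chunk ran out before a decision.
    """
    for i, c in enumerate(chunk):
        state = deriv(state, c)
        if state == NULL:
            return ("fail", None)
        if nullable(state):
            return ("match", chunk[i + 1:])
    return ("more", state)
```

With r = seq(char("a"), seq(star(char("b")), char("c"))), i.e. ab*c,
feed(r, "ab") returns ("more", State), and feeding that State the next
block "bcd" returns ("match", "d"). The caller can recover the matched
part by keeping the blocks it has fed so far and dropping the returned
remainder. A longest-match or unanchored search would need extra
bookkeeping that this sketch leaves out.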

Darius


