pcre, bifs, drivers and ports

Robert Virding robert.virding@REDACTED
Wed Aug 2 01:54:38 CEST 2006

If you write it in Erlang then this is not that difficult to do.


Scott Lystig Fritchie wrote:
> >>>>> "sh" == Sean Hinde <sean.hinde@REDACTED> writes:
> sh> Mats' comment about limiting length of REs does
> sh> not really cut it IMO. Blocking the whole emulator during a long
> sh> regexp calculation rarely sounds like the right solution for
> sh> typical Erlang apps.
> One more thing to consider.  A *really* useful regexp library (or
> any library that deals with strings) would be one that worked on:
>     1. lists of byte values (the traditional Erlang "string")
>     2. single binary terms
>     3. "I/O lists", an arbitrarily deep list of #1 and/or #2.
>        (Or #2 alone :-)
> I would guess that that would come at a high implementation cost, since
> most C/C++ regexp packages operate on buffers of contiguous bytes, not
> a string of bytes located in perhaps thousands of non-contiguous
> places.
> Oops, I forgot one:
>     4. A possibly UNICODE/whatever internationalized "string" thingie
>        stored in an I/O list.
> As discussed on this list a few weeks ago, there is no agreement on
> how to represent such a thing ... in Erlang or most other languages.
> sh> But. It would be most fascinating to compare real world
> sh> characteristics of:
> sh> 1. BIF pcre 2. Driver pcre, 3. BIF pcre in SMT erlang.
> Yup.
> A linked-in driver can cheat even more if it can get a (internal C)
> pointer to the term(s) it's operating on.  It's quite easy to create a
> new BIF that returns the internal pointer/address of its argument and
> return it as an integer.(*)  Turning that integer into a pointer, the
> driver has full access to the term.  Use the power only for good.  :-)
> -Scott
> (*) It is a good, simple experiment if you've never tried writing a
> BIF before.
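
To make the contiguous-buffer point in the quoted text concrete, here is a
minimal C sketch of what a driver or BIF would probably have to do before
handing scattered data to a typical C regexp package: copy the pieces into
one buffer first. Everything named here is an assumption for illustration:
`struct chunk`, `flatten_chunks` and `chunks_match` are hypothetical, and
POSIX `regex.h` stands in for pcre to keep the example self-contained.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one piece of an I/O list: a run of bytes
 * that is not adjacent in memory to its neighbours. */
struct chunk {
    const char *data;
    size_t len;
};

/* Copy all chunks into one freshly allocated, NUL-terminated buffer.
 * Returns NULL on allocation failure; the caller frees the result.
 * Note: data containing embedded NUL bytes would be cut short by
 * regexec(), which expects a C string. */
static char *flatten_chunks(const struct chunk *chunks, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += chunks[i].len;

    char *buf = malloc(total + 1);
    if (buf == NULL)
        return NULL;

    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        memcpy(buf + off, chunks[i].data, chunks[i].len);
        off += chunks[i].len;
    }
    buf[off] = '\0';
    return buf;
}

/* Returns 1 if the POSIX extended regexp matches anywhere in the
 * flattened chunks, 0 if it does not, -1 on any error. */
static int chunks_match(const char *pattern,
                        const struct chunk *chunks, size_t n)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    char *buf = flatten_chunks(chunks, n);
    if (buf == NULL) {
        regfree(&re);
        return -1;
    }

    int rc = regexec(&re, buf, 0, NULL, 0);
    free(buf);
    regfree(&re);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

The copy is the cost being discussed: it is linear in the total size of the
data and happens before matching even starts. A matcher written to walk the
pieces directly would avoid it.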
