[erlang-questions] regexp sux!

Mats Cronqvist mats.cronqvist@REDACTED
Sat Mar 10 10:44:50 CET 2007


Torbjorn Tornkvist wrote:
> As usual, the regexp module drives me nuts.
> I simply want to do the equivalent to:
> 
> # from a bash shell
> 
> xx="<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>"
> echo $xx | sed 's/\(.*xml version=.*encoding="\)\([a-zA-Z0-9_-]*\).*/\2/'
> ISO-8859-1
> 
> So how do I do this with the regexp module ?

   this will pull out the encoding string:

 > XX="<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>".
 > {match,B,L}=regexp:match(XX,"encoding=\"[A-Z0-9-]+\""),string:sub_string(XX,B+10,L+B-2).

   but i guess the question was really about how to get regexp groups/submatches 
to work. AFAIK the regexp module has no support for them at all, so i use the 
gregexp code by pascal brisset.

 > gregexp:groups(XX,".*encoding=\"\\([A-Z0-9-]+\\)\".*").
{match,["ISO-8859-1"]}
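
   for comparison, a minimal sketch of the same submatch extraction with the 
re module. this assumes an OTP release that ships re, which is not the setup 
discussed above, and the module and function names (xml_encoding, extract) 
are made up for illustration.

```erlang
-module(xml_encoding).
-export([extract/1]).

%% Return {ok, Encoding} for the encoding="..." attribute, or nomatch.
%% Assumption: the re module is available; it is not part of the
%% regexp/gregexp setup described in this thread.
extract(Xml) ->
    Pattern = "encoding=\"([A-Za-z0-9_-]+)\"",
    %% {capture,[1],list} returns only the first group, as a string
    case re:run(Xml, Pattern, [{capture, [1], list}]) of
        {match, [Encoding]} -> {ok, Encoding};
        nomatch -> nomatch
    end.
```

   with XX bound as above, xml_encoding:extract(XX) gives {ok,"ISO-8859-1"}.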

   mats


