Hello group,<div><br></div><div>Running 5.6.5 under Windows...</div><div><br></div><div>I've got a bunch of code that's "almost but not quite syntactically correct" XML, and I'm trying to convert it to valid XML. Part of this process involves removing some invalid CDATA tags.<br>
</div><div><br></div><div>My code fragment:</div><div> re:replace("abc123", "<!\[CDATA\[<", "<", [{return, list}]).</div><div>is giving me "exception error: bad argument in function re:replace/4".</div>
<div><br></div><div>Trial and error shows that removing the escaped [ characters:</div><div> re:replace("abc123 <![CDATA[< abc123", "<!CDATA<", "<", [{return, list}]).</div><div>
runs without error, but that pattern is obviously not what I want.</div><div><br></div><div>However, "double-escaping" the [ characters (adding a second \ before each [ character) does exactly what I want:</div><div>
re:replace("abc123 <![CDATA[< abc123", "<!\\[CDATA\\[<", "<", [{return, list}])</div><div>returns "abc123 < abc123", which is the result I'm after.</div><div>
<br></div><div>In this context, I guess it's conceivable that the [ character could be interpreted in two distinct ways:</div><div>- it could denote the start of an Erlang list</div><div>- it could denote the start of a character class within a regular expression</div>
<div>However, I didn't expect that "double escaping" it would be the solution to my problem.</div><div><br></div><div>Is this expected behaviour, or some sort of anomaly? In any case, sending this email to the mailing list should help out the next person who falls into this trap, but who can use Google to track down the solution...</div>
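<div><br></div><div>A minimal sketch of one likely explanation, assuming the usual Erlang string-literal rules: the backslash is consumed as an escape character when the literal is read, so "\[" reaches re as a plain [, while "\\[" reaches it as a backslash followed by [. The module and function names below are hypothetical; only re:compile/1 and re:replace/4 are library calls.</div>

```erlang
-module(re_escape_demo).
-export([run/0]).

%% Hypothetical demo module, not part of any library.
run() ->
    %% The literal "\\[" is two characters, a backslash and a bracket,
    %% so re receives the regex <!\[CDATA\[< (12 characters).
    Escaped = "<!\\[CDATA\\[<",
    12 = length(Escaped),

    %% With no backslashes left, the regex <![CDATA[< opens a character
    %% class that is never closed, so it does not compile.
    {error, _Reason} = re:compile("<![CDATA[<"),

    %% The double-escaped pattern matches the literal text and replaces it.
    re:replace("abc123 <![CDATA[< abc123", Escaped, "<", [{return, list}]).
```

<div>If that reading is right, the first backslash is for the Erlang string literal and the second is for the regular expression, and a pattern that fails to compile would account for the "bad argument" error.</div>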
<div><br></div><div>Regards</div><div><br></div><div>David Mitchell</div>