<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Not a bug, just a surprise -<br>
The string<br>
<font face="Arial" size="2"><span style="font-size:10pt;">"/\\*.*?\\*/"</span></font><br>
contains only two backslashes:<br>
<blockquote>&gt; length([C || C &lt;- "/\\*.*?\\*/", C==92]). % 92 is the character code for backslash.<br>
2<br>
</blockquote>
The string literal in the *Erlang source file*[1], of course,
plainly contains four of them.<br>
The surprise is that there are two interpreters involved -- two
layers, each of which requires backslash-escaping:<br>
<ol>
<li>The Erlang parser reads [Backslash, Backslash] and puts a
single backslash into the resulting string (rather than taking
the second backslash as the start of another escape sequence).</li>
<li>The "re" module reads [Backslash, Asterisk] and interprets
this by taking the asterisk literally (rather than as a
zero-or-more quantifier).</li>
</ol>
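<br>
Both layers can be checked directly. What follows is only a minimal
sketch: the module name backslash_demo and the function demo/0 are
made up for illustration, and the subject string is the one from the
shell session quoted below.<br>
<pre>
-module(backslash_demo).
-export([demo/0]).

%% Hypothetical helper, for illustration only.
demo() -&gt;
    Pattern = "/\\*.*?\\*/",
    %% Layer 1: the Erlang parser has already turned each "\\" into one
    %% backslash, so the string holds 9 characters, 2 of them backslashes.
    9 = length(Pattern),
    2 = length([C || C &lt;- Pattern, C =:= $\\]),
    %% Printing the string shows exactly what "re" receives: /\*.*?\*/
    io:format("~s~n", [Pattern]),
    %% Layer 2: "re" reads each \* as a literal asterisk.
    {match, [{0, 25}]} = re:run("/* this is a C comment */", Pattern),
    ok.
</pre>
Printing the pattern with io:format/2 is a quick way to see what the
second layer actually gets, since the shell displays the string in
its escaped source form.<br>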
<br>
Or going the other way: if we want a literal asterisk in the
pattern, we must escape it, so that "re" sees a backslash in front
of the asterisk.<br>
Thus the string itself must contain a backslash. To write that
string as a literal in Erlang source code, we must put a second
backslash in front of the first, because the backslash is also
special to the Erlang parser.<br>
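<br>
The single-backslash version fails for the same reason, read in
reverse. The sketch below assumes a made-up module name,
single_backslash_demo, and uses re:compile/1 instead of re:run/2 so
that the invalid pattern comes back as an error tuple rather than an
exception:<br>
<pre>
-module(single_backslash_demo).
-export([demo/0]).

%% Hypothetical helper, for illustration only.
demo() -&gt;
    %% \* is not a special escape for the Erlang parser, so the
    %% backslash is dropped and only the asterisk stays in the string.
    true = ("/\*.*?\*/" =:= "/*.*?*/"),
    %% "re" therefore gets /*.*?*/, which is not a valid pattern.
    {error, _Reason} = re:compile("/*.*?*/"),
    %% With doubled backslashes, "re" gets /\*.*?\*/ and accepts it.
    {ok, _MP} = re:compile("/\\*.*?\\*/"),
    ok.
</pre>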
<br>
/Erik<br>
<br>
[1] Or in this case, the expression typed into the Erlang shell.<br>
<br>
On 20-11-2012 09:17, Arif Ishaq wrote:<br>
</div>
<blockquote
cite="mid:1CAB695D2C2A8F4BB0B242A5B44C75E901A904@ESESSMB301.ericsson.se"
type="cite">
<font face="Arial" size="2"><span style="font-size:10pt;">
<div>Hi,</div>
<div> </div>
<div>The backslash character as escape doesn't work as
expected.</div>
<div> </div>
<div style="padding-left:19pt;">Erlang R15B (erts-5.9)
[smp:4:4] [async-threads:0]</div>
<div style="padding-left:19pt;"> </div>
<div style="padding-left:19pt;">Eshell V5.9 (abort with ^G)</div>
<div style="padding-left:19pt;">1> String = "/* this is a C
comment */".</div>
<div style="padding-left:19pt;">"/* this is a C comment */"</div>
<div style="padding-left:19pt;">2> re:run(String,
"/\*.*?\*/").</div>
<div style="padding-left:19pt;">** exception error: bad
argument</div>
<div style="padding-left:19pt;"> in function re:run/2</div>
<div style="padding-left:19pt;"> called as re:run("/*
this is a C comment */","/*.*?*/")</div>
<div style="padding-left:19pt;">3> re:run(String,
"/\\*.*?\\*/").</div>
<div style="padding-left:19pt;">{match,[{0,25}]}</div>
<div style="padding-left:19pt;">4> </div>
<div style="padding-left:19pt;"> </div>
<div>Best regards</div>
<div> </div>
<div> </div>
<div>PS. The documentation in
"erl5.9/lib/stdlib-1.18/doc/html/re.html" says: </div>
<div> </div>
<div>".. the pattern</div>
<div> </div>
<div>/\*.*?\*/</div>
<div> </div>
<div>does the right thing with the C comments."</div>
<div> </div>
</span></font>
</blockquote>
<br>
<br>
</body>
</html>