Could someone from the OTP team confirm whether this is a bug?<br><br>If it is, I could really use a patch :-)<br><br>- Mikkel<br><br><div class="gmail_quote">On Fri, Jun 27, 2008 at 2:57 PM, Mikkel Jensen <<a href="mailto:mj@issuu.com">mj@issuu.com</a>> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">It seems there is a bug in xmerl when loading elements that contain numeric character references followed by UTF-8 characters.<br>
<br>Example: é newline é<br><br>1> element(1, xmerl_scan:string("<a>\303\251&#10;\303\251</a>", [{encoding, 'utf-8'}])).<br>
{xmlElement,a,a,[],<br> {xmlNamespace,[],[]},<br> [],1,[],<br> [{xmlText,[{a,1}],1,[],"\303\251",text},<br> {xmlText,[{a,1}],2,[],[10,195,131,194,169],text}],<br> [],"/",undeclared}<br>
<br>Xmerl splits the parsed value around the newline character (strange, but OK). However, the first part is encoded correctly, while the second part is garbled: é comes back as [195,131,194,169] rather than "\303\251"!<br><br>It's worth noting that attribute values are encoded correctly:<br>
<br>2> element(1, xmerl_scan:string("<a b=\"\303\251&#10;\303\251\"/>", [{encoding, 'utf-8'}])).<br>{xmlElement,a,a,[],<br> {xmlNamespace,[],[]},<br> [],1,<br>
[{xmlAttribute,b,[],[],[],[],1,[],"\303\251 \303\251",false}],<br> [],[],"/",undeclared}<br><font color="#888888"><br>- Mikkel<br>
</font></blockquote></div>
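A side note on the garbled bytes: [195,131,194,169] is exactly what comes out when the two UTF-8 bytes of é are treated as Latin-1 code points and encoded to UTF-8 a second time. The sketch below reproduces that, under a few assumptions: the module name `double_encode` and the function `reencode/1` are made up for illustration, it relies on the `unicode` module from OTP releases newer than this thread, and it only shows that the output is consistent with a double encoding, not where in xmerl that might happen.

```erlang
-module(double_encode).
-export([reencode/1]).

%% Hypothetical helper, not part of xmerl.
%% Takes a list of bytes that is already UTF-8 encoded, treats every
%% byte as a Latin-1 code point, and encodes the result as UTF-8 again.
reencode(Utf8Bytes) ->
    binary_to_list(unicode:characters_to_binary(Utf8Bytes, latin1, utf8)).
```

Calling `double_encode:reencode("\n\303\251")` returns a list equal to [10,195,131,194,169], the same bytes as the second xmlText node in the quoted output.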