I ran into the same problem in the latest release.<br>I found out that if you tell the parser what encoding to expect like this:<br><br>{Xml, _} = xmerl_scan:string(XmlString, [{encoding, "iso-10646-utf-1"}]),<br>
<br>it will handle UTF-8 correctly. Of course, this only works if you know the encoding in advance. A better solution would be for the parser to honor the encoding declared in the XML header and to default to UTF-8, as previous versions did.<br>
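<br>Here is a minimal sketch of the workaround above, wrapped in a helper so that a fatal scan error (such as wfc_Legal_Character) comes back as a value instead of crashing the caller. The module name xmerl_utf8_sketch and the functions parse/1 and parse/2 are made up for illustration; only xmerl_scan:string/2 and the encoding option are taken from the call above, and the behaviour may differ between xmerl versions.<br>
<br>
```erlang
-module(xmerl_utf8_sketch).
-export([parse/1, parse/2]).

%% Hypothetical helper: parse using the encoding name from the post above.
parse(XmlString) ->
    parse(XmlString, "iso-10646-utf-1").

%% The encoding is passed straight through to xmerl_scan:string/2.
%% Any error raised by the scanner is caught and returned as a tagged
%% tuple, so the caller can decide how to react.
parse(XmlString, Encoding) ->
    try xmerl_scan:string(XmlString, [{encoding, Encoding}]) of
        {Xml, _Rest} -> {ok, Xml}
    catch
        Class:Reason -> {error, {Class, Reason}}
    end.
```
<br>
Calling xmerl_utf8_sketch:parse(XmlString) then returns either {ok, Xml} or {error, {Class, Reason}}, so a document that still fails can be logged with the scanner's own reason.<br>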
<br>- Mikkel<br><br><div class="gmail_quote">On Fri, Feb 13, 2009 at 11:38 AM, Michal Ptaszek <span dir="ltr">&lt;<a href="mailto:michal.ptaszek@erlang-consulting.com">michal.ptaszek@erlang-consulting.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi All,<br>
<br>
After the migration from R12B4 to R12B5 (xmerl version changed from 1.1.9 to 1.1.10)<br>
I have noticed something probably unwanted.<br>
<br>
During the document processing phase, the wfc_Legal_Character fatal error is thrown even<br>
if I use the proper header (&lt;?xml version="1.0" encoding="utf-8"?&gt;).<br>
<br>
The previous version of xmerl was dealing with UTF-8 encoded characters flawlessly,<br>
the newest one unfortunately does not want to cooperate.<br>
<br>
Is this an xmerl bug, an intended feature, or my misunderstanding of xmerl (if so, how do I correctly parse a document<br>
containing UTF-8 encoded characters)?<br>
<br>
Best regards,<br>
<font color="#888888">--<br>
Michal Ptaszek<br>
<a href="http://www.erlang-consulting.com" target="_blank">www.erlang-consulting.com</a><br>
</font></blockquote></div>