I ran into the same problem in the latest release.<br>I found that if you tell the parser which encoding to expect, like this:<br><br>{Xml, _} = xmerl_scan:string(XmlString, [{encoding, "iso-10646-utf-1"}]),<br>

<br>it handles UTF-8 correctly. Of course, this only works if you know the encoding in advance. A better solution would be for the parser to honor the encoding declared in the XML header and to default to UTF-8, as previous versions did.<br>

<br>- Mikkel<br><br><div class="gmail_quote">On Fri, Feb 13, 2009 at 11:38 AM, Michal Ptaszek <span dir="ltr">&lt;<a href="mailto:michal.ptaszek@erlang-consulting.com">michal.ptaszek@erlang-consulting.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi All,<br>

<br>

After migrating from R12B4 to R12B5 (xmerl version changed from 1.1.9 to 1.1.10),<br>

I have noticed some behavior that is probably unintended.<br>

<br>

During the document processing phase, the wfc_Legal_Character fatal error is thrown even<br>

if I use the proper header (&lt;?xml version="1.0" encoding="utf-8"?&gt;).<br>

<br>

The previous version of xmerl handled UTF-8 encoded characters flawlessly;<br>

the newest one unfortunately does not.<br>

<br>

Is this an xmerl bug, an intended feature, or my misunderstanding of xmerl (if the latter, how do I parse a document<br>

containing UTF-8 encoded characters correctly)?<br>

<br>

Best regards,<br>

<font color="#888888">--<br>

Michal Ptaszek<br>

<a href="http://www.erlang-consulting.com" target="_blank">www.erlang-consulting.com</a><br>


</font></blockquote></div>