[erlang-questions] Exception in xmerl, when parsing XML with non UTF8 character set

Bertil Karlsson bertil.karlsson@REDACTED
Mon Jan 7 08:43:19 CET 2008

If I'm right, windows-1252 uses its own conversion table, which doesn't
exist in xmerl today. Just changing the encoding to something that seems
to work may cause trouble with the characters that differ between the
two encodings.
It is not difficult to add the needed changes to xmerl, but I cannot
promise that they will make it into the next release.
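
As an illustration of the kind of conversion table meant here, below is a
minimal sketch of a windows-1252 to Unicode decoder. It is not part of xmerl:
the module name cp1252 and the function to_unicode/1 are hypothetical, and
raising an error on the five unassigned bytes is a choice made for this
sketch. Only the range 0x80..0x9F needs a table, since the rest of
windows-1252 coincides with ISO-8859-1.

```erlang
-module(cp1252).
-export([to_unicode/1]).

%% Decode a windows-1252 encoded binary into a list of Unicode code points.
to_unicode(Bin) when is_binary(Bin) ->
    [char(B) || <<B>> <= Bin].

%% Bytes 0x80..0x9F are where windows-1252 departs from ISO-8859-1.
char(16#80) -> 16#20AC;  % euro sign
char(16#82) -> 16#201A;
char(16#83) -> 16#0192;
char(16#84) -> 16#201E;
char(16#85) -> 16#2026;
char(16#86) -> 16#2020;
char(16#87) -> 16#2021;
char(16#88) -> 16#02C6;
char(16#89) -> 16#2030;
char(16#8A) -> 16#0160;
char(16#8B) -> 16#2039;
char(16#8C) -> 16#0152;
char(16#8E) -> 16#017D;
char(16#91) -> 16#2018;
char(16#92) -> 16#2019;
char(16#93) -> 16#201C;
char(16#94) -> 16#201D;
char(16#95) -> 16#2022;
char(16#96) -> 16#2013;
char(16#97) -> 16#2014;
char(16#98) -> 16#02DC;
char(16#99) -> 16#2122;  % trade mark sign
char(16#9A) -> 16#0161;
char(16#9B) -> 16#203A;
char(16#9C) -> 16#0153;
char(16#9E) -> 16#017E;
char(16#9F) -> 16#0178;
%% 0x81, 0x8D, 0x8F, 0x90 and 0x9D are unassigned in windows-1252.
%% Rejecting them is an assumption of this sketch.
char(B) when B >= 16#80, B =< 16#9F ->
    erlang:error({bad_character_code, B});
%% Everything else maps to the same code point as in ISO-8859-1.
char(B) -> B.
```

For example, cp1252:to_unicode(<<"caf", 16#E9, " ", 16#80, "5">>) gives
[99,97,102,233,32,8364,53]. Treating the same bytes as ISO-8859-1 would turn
0x80 into the control character U+0080 instead of the euro sign, which is the
kind of trouble to expect from just relabelling the encoding.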


Zvi wrote:
> 3> { Xml, _Rest } = xmerl_scan:file(ResultIdx).
> ** exception exit: {bad_character_code,
>                        "<!DOCTYPE BODY SYSTEM
> "http://www.xxx.com/yyy.dtd\">\n<BODY>\n<RENDERING>\naaa</RENDERING>\n</BODY>\n",
>                        'windows-1252'}
>      in function  xmerl_ucs:to_unicode/2
>      in call from xmerl_scan:scan_document/2
>      in call from xmerl_scan:file/2
> The XML document starts with the PI: <?xml version="1.0"
> encoding="windows-1252"?>
> It works after changing it to
>    <?xml version="1.0" encoding="utf-8"?>
> The problem is that this XML document is generated by 3rd party SW, so I would
> like to fix the xmerl code, or use some xmerl option.
> I am using R12B on Windows.
> Zvi
