[erlang-questions] character encoding and xmerl

Heinz N. Gies heinz@REDACTED
Thu Jan 5 13:45:40 CET 2012

is a good article about Erlang and Unicode. I am not 100% sure, but I think it might help you :)

Heinz N. Gies

On Jan 5, 2012, at 10:25, Martin Dimitrov wrote:

> Hello,
> In our app we upload an XML file through a simple form. The page is
> encoded in UTF-8, as is the file.
> YAWS gathers the parts of the file, flattens them and sends them to
> xmerl. The XML is scanned through xmerl_scan:string with {encoding,
> "utf-8"}. When I dump the string the Cyrillic word продукт is printed as
> 208,191,209,128,208,190,208,180,209,131,208,186,209,130.
> After the scan, the Cyrillic word is printed as
> 1087,1088,1086,1076,1091,1082,1090 which, as far as I know, is
> the correct Unicode representation.
> The problem appears when our internal structures are exported to XML.
> When we then try to scan that XML again, xmerl reports:
> {fatal,{{unexpected_char,{error,{bad_character,1087}}}
> Thanks in advance,
> Martin
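
The two number sequences quoted above are the same word in two forms:
the first is the UTF-8 byte sequence, the second is the list of Unicode
code points. The sketch below shows the conversion in both directions
using the standard unicode module. The module name utf8_roundtrip and
its function names are made up for illustration. It is an assumption,
not something confirmed in this thread, that the exported XML is a list
of code points and that xmerl_scan:string expects UTF-8 bytes when the
encoding is "utf-8"; if so, re-encoding the exported string to bytes
before scanning it again may avoid the bad_character error.

```erlang
-module(utf8_roundtrip).
-export([bytes_to_codepoints/1, codepoints_to_bytes/1, demo/0]).

%% Decode a list of UTF-8 bytes into a list of Unicode code points.
bytes_to_codepoints(Bytes) when is_list(Bytes) ->
    unicode:characters_to_list(list_to_binary(Bytes), utf8).

%% Encode a list of Unicode code points as a list of UTF-8 bytes.
codepoints_to_bytes(CodePoints) when is_list(CodePoints) ->
    binary_to_list(unicode:characters_to_binary(CodePoints, unicode, utf8)).

%% Round-trip the word from the original message. Each match below
%% fails with badmatch if the conversion gives a different result.
demo() ->
    Bytes = [208,191,209,128,208,190,208,180,209,131,208,186,209,130],
    CodePoints = [1087,1088,1086,1076,1091,1082,1090],
    CodePoints = bytes_to_codepoints(Bytes),
    Bytes = codepoints_to_bytes(CodePoints),
    ok.
```

Under that assumption, the exported string would go through
codepoints_to_bytes/1 (or an equivalent call) before being handed back
to xmerl_scan:string.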
