[erlang-questions] character encoding and xmerl

Heinz N. Gies heinz@REDACTED
Thu Jan 5 13:45:40 CET 2012


http://ferd.ca/will-the-real-unicode-wrangler-please-stand-up.html
is a good article about Erlang and Unicode. I am not 100% sure, but I think it might help you :)
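
For what it is worth, the two number lists in the quoted message look like the same word in two forms: the first is the UTF-8 byte sequence, the second is the list of Unicode code points. Here is a minimal sketch of converting between them with the standard unicode module. The module name utf8_demo and the function run/0 are made up for illustration. I have not checked this against xmerl itself, so treat the idea that the exported XML needs re-encoding to UTF-8 bytes before it is scanned again as an assumption, not a confirmed fix.

```erlang
-module(utf8_demo).   % hypothetical module name, for illustration only
-export([run/0]).

run() ->
    %% UTF-8 bytes of the word, as dumped before the scan in the quoted message.
    Bytes = [208,191,209,128,208,190,208,180,209,131,208,186,209,130],

    %% Decode the UTF-8 bytes into a list of Unicode code points.
    CodePoints = unicode:characters_to_list(list_to_binary(Bytes), utf8),

    %% This is the list printed after the scan in the quoted message.
    [1087,1088,1086,1076,1091,1082,1090] = CodePoints,

    %% Encode the code points back into UTF-8 bytes. A step like this may be
    %% what is missing before the exported XML is handed to the scanner again
    %% (an assumption, not verified against xmerl).
    Utf8 = unicode:characters_to_binary(CodePoints, unicode, utf8),
    Bytes = binary_to_list(Utf8),
    ok.
```

Calling utf8_demo:run() returns ok if both pattern matches hold, that is, if the two lists really are encodings of the same seven characters.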

--
Heinz N. Gies
heinz@REDACTED
http://licenser.net

On Jan 5, 2012, at 10:25, Martin Dimitrov wrote:

> Hello,
> 
> In our app we upload an XML file through a simple form. Both the page
> and the file are encoded in UTF-8.
> 
> YAWS gathers the parts of the file, flattens them and sends them to
> xmerl. The XML is scanned through xmerl_scan:string with {encoding,
> "utf-8"}. When I dump the string the Cyrillic word продукт is printed as
> 208,191,209,128,208,190,208,180,209,131,208,186,209,130.
> 
> After the scan, the Cyrillic word is printed as
> 1087,1088,1086,1076,1091,1082,1090 which, as far as I know, is
> the correct Unicode representation.
> 
> The problem appears when our internal structures are exported to XML.
> When we then try to scan that XML again, xmerl reports:
> 
> {fatal,{{unexpected_char,{error,{bad_character,1087}}}
> 
> 
> Thanks in advance,
> 
> Martin
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
