[erlang-bugs] xmerl default encoding
Paul Mineiro
paul-trapexit@REDACTED
Tue Jan 22 06:59:12 CET 2008
when i run the attached document (a simple xml document that lacks an
encoding declaration) through xmerl_scan:file/1, the result contains the
iso-8859-1 encoding of tilde n (\361). however, the original contains the
utf-8 encoding of tilde n (\303\261), and the character set change
surprised me. adding an { encoding, "utf-8" } option to xmerl_scan:file/2
fixed things, but the reference manual (and the xml spec) say that utf-8 is
the default.
thanks,
-- p
Eshell V5.5.5 (abort with ^G)
1> xmerl_scan:file ("noencodingdecl.xml").
{{xmlElement,'Actor',
             'Actor',
             [],
             {xmlNamespace,[],[]},
             [],
             1,
             [],
             [{xmlText,[{'Actor',1}],1,[],"Elizabeth Pe\361a",text}],
             [],
             ".",
             undeclared},
 []}
2> xmerl_scan:file ("noencodingdecl.xml", [ { encoding, "utf-8" } ]).
{{xmlElement,'Actor',
             'Actor',
             [],
             {xmlNamespace,[],[]},
             [],
             1,
             [],
             [{xmlText,[{'Actor',1}],1,[],"Elizabeth Pe\303\261a",text}],
             [],
             ".",
             undeclared},
 []}
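
for reference, here is a minimal sketch of how the two escapes in the
transcripts above relate to each other. it is not part of xmerl: the module
name enc_demo is made up for illustration, and it assumes an OTP release
that ships the unicode module, which the V5.5.5 shell above may not have.
the octal value \361 (decimal 241) is both the iso-8859-1 byte for tilde n
and its unicode code point, while \303\261 is the two-byte utf-8 encoding
of that same code point.

```erlang
%% enc_demo is a hypothetical module name used only for this illustration.
-module(enc_demo).
-export([run/0]).

run() ->
    %% "Pe~na" as raw utf-8 bytes: tilde n is the pair 8#303, 8#261.
    Utf8 = <<"Pe", 8#303, 8#261, "a">>,
    %% Decode the utf-8 bytes into a list of unicode code points.
    Chars = unicode:characters_to_list(Utf8, utf8),
    %% Re-encode the code points as iso-8859-1 (latin1) bytes.
    Latin1 = unicode:characters_to_binary(Chars, unicode, latin1),
    {Chars, Latin1}.
```

calling enc_demo:run() should return the code point list with 241 (8#361)
in third position, together with a latin1 binary holding the same single
byte, so a string showing \361 may be either latin1 bytes or decoded code
points.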
-------------- next part --------------
A non-text attachment was scrubbed...
Name: noencodingdecl.xml
Type: application/xml
Size: 52 bytes
Desc:
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20080121/c06a332d/attachment.wsdl>