[erlang-patches] [bug & patch] xmerl_scan doesn't decode &#x refs properly
Raimo Niskanen
raimo+erlang-patches@REDACTED
Tue Jun 8 09:58:51 CEST 2010
On Mon, Jun 07, 2010 at 06:17:47PM +0200, Paul Guyot wrote:
> Hello,
>
> There is a bug in xmerl_scan. It doesn't decode &#x refs properly.
>
> Consider the following code:
>
> {UTF8Output, []} = xmerl_scan:string("<?xml version=\"1\" ?>\n<text>" ++ [229, 145, 156] ++ "</text>"),
> #xmlElement{content = [#xmlText{value = UTF8Text}]} = UTF8Output,
> {DecEntityOutput, []} = xmerl_scan:string("<?xml version=\"1\" ?>\n<text>&#21596;</text>"),
> #xmlElement{content = [#xmlText{value = DecEntityText}]} = DecEntityOutput,
> {HexEntityOutput, []} = xmerl_scan:string("<?xml version=\"1\" ?>\n<text>&#x545C;</text>"),
> #xmlElement{content = [#xmlText{value = HexEntityText}]} = HexEntityOutput,
>
> UTF8Text and DecEntityText are equal and as expected ([16#545C]).
> HexEntityText is (incorrectly) a list composed of the three UTF-8 bytes [229, 145, 156], whereas it should be equal to [16#545C].
>
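The expected value can be checked independently of xmerl. The following is
a minimal sketch, not the patch itself: the module name charref_sketch and
its helper functions are hypothetical and exist only for illustration. It
relies on the standard list_to_integer/2 and unicode:characters_to_list/2,
and shows that the UTF-8 bytes, the decimal reference and the hexadecimal
reference all denote the single code point 16#545C.

```erlang
-module(charref_sketch).
-export([decode_char_ref/1, utf8_to_codepoints/1, demo/0]).

%% Hypothetical helper (not part of xmerl): decode one numeric character
%% reference, e.g. "&#21596;" or "&#x545C;", into a Unicode code point.
%% The hexadecimal clause must come first, since "&#" is a prefix of "&#x".
decode_char_ref("&#x" ++ Rest) ->
    list_to_integer(strip_semicolon(Rest), 16);
decode_char_ref("&#" ++ Rest) ->
    list_to_integer(strip_semicolon(Rest), 10).

%% Drop the ";" that terminates the reference; crashes if it is missing.
strip_semicolon(Str) ->
    [$; | Rev] = lists:reverse(Str),
    lists:reverse(Rev).

%% Decode a list of UTF-8 bytes into a list of Unicode code points.
utf8_to_codepoints(Bytes) ->
    unicode:characters_to_list(list_to_binary(Bytes), utf8).

%% All three spellings of the character should yield the same code point.
demo() ->
    Expected = [16#545C],
    Expected = utf8_to_codepoints([229, 145, 156]),
    Expected = [decode_char_ref("&#21596;")],
    Expected = [decode_char_ref("&#x545C;")],
    ok.
```

After compiling the module, calling charref_sketch:demo() should return ok
if the three representations agree.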
> A patch with a test case can be found here:
>
> git fetch git://github.com/pguyot/otp.git pg/xmerl_scan_hex_entities
Thank you! It will be included in 'pu' after I reformat the commit
message and cherry-pick it onto 'dev', since it was based not on 'dev'
but on a merge result containing 'dev'.
>
> Regards,
>
> Paul
> --
> Semiocast http://semiocast.com/
> +33.175000290 - 62 bis rue Gay-Lussac, 75005 Paris
>
>
> ________________________________________________________________
> erlang-patches (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-patches-unsubscribe@REDACTED
>
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB