[erlang-patches] Patch to xmerl_scan to fix character-reference normalization in attribute values

Henrik Nord henrik@REDACTED
Tue May 3 15:17:08 CEST 2011


On 04/28/2011 11:24 PM, Tom Moertel wrote:
> The following short patch fixes a bug in xmerl that causes character 
> references in attribute values to be normalized incorrectly:
>
>     git fetch https://github.com/tmoertel/otp.git xmerl_attr_charref_fix
>
> Explanation:
>
> Section 3.3.3 of the XML Recommendation gives the rules for
> attribute-value normalization.  One of those rules requires
> that character references not be re-normalized after being
> replaced with the referenced characters:
>
>     For a character reference, append the referenced
>     character to the normalized value.
>
> And, in particular:
>
>     Note that if the unnormalized attribute value contains
>     a character reference to a white space character other
>     than space (#x20), the normalized value contains the
>     referenced character itself (#xD, #xA or #x9).
>
>     Source: http://www.w3.org/TR/xml/#AVNormalize
>
> In xmerl_scan, however, character references in attributes are
> normalized again after replacement.  For example, the
> character reference "&#xA;" in the following XML document gets
> normalized (incorrectly) into a space when parsed:
>
>     2> xmerl_scan:string("<root x='&#xA;'/>").
>     {... [{xmlAttribute,x,[],[],[],[],1,[]," ",false}] ...}
>
> This short patch restores the correct behavior:
>
>     2> xmerl_scan:string("<root x='&#xA;'/>").
>     {... [{xmlAttribute,x,[],[],[],[],1,[],"\n",false}] ...}
>
> NOTE:  This change does not include tests because I could not
> find a test suite for xmerl.
>
>
>
> Cheers,
> Tom
>
>
> _______________________________________________
> erlang-patches mailing list
> erlang-patches@REDACTED
> http://erlang.org/mailman/listinfo/erlang-patches

Your branch is included in 'pu'.
If nothing major breaks, it will be merged into 'dev' shortly.

Thank you for the contribution!
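
For readers following the thread, the rule quoted from section 3.3.3 can be sketched as a small standalone function. This is only an illustration of the normalization rule for CDATA attributes, not the code in xmerl_scan and not the patch itself. The module name attr_norm and the function normalize/1 are made up for this example, and the sketch assumes well-formed input, ignores entity references such as "&amp;", and assumes line endings have already been normalized.

```erlang
%% Hypothetical module, for illustration only. Not part of xmerl.
-module(attr_norm).
-export([normalize/1]).

%% normalize(Raw) -> the normalized value of a CDATA attribute whose
%% unnormalized text is Raw (a plain list of characters).
normalize(Raw) ->
    lists:reverse(norm(Raw, [])).

norm([], Acc) ->
    Acc;
%% Hexadecimal character reference: append the referenced character
%% as-is. It is NOT passed through the white space rule below.
norm("&#x" ++ Rest, Acc) ->
    {Digits, ";" ++ Tail} = lists:splitwith(fun(C) -> C =/= $; end, Rest),
    norm(Tail, [list_to_integer(Digits, 16) | Acc]);
%% Decimal character reference: same treatment.
norm("&#" ++ Rest, Acc) ->
    {Digits, ";" ++ Tail} = lists:splitwith(fun(C) -> C =/= $; end, Rest),
    norm(Tail, [list_to_integer(Digits) | Acc]);
%% Literal white space (#x20, #xD, #xA, #x9) becomes a single space.
norm([C | Rest], Acc) when C =:= $\s; C =:= $\r; C =:= $\n; C =:= $\t ->
    norm(Rest, [$\s | Acc]);
%% Any other character is appended unchanged.
norm([C | Rest], Acc) ->
    norm(Rest, [C | Acc]).
```

With this sketch, a literal newline in the attribute text becomes a space, while the reference "&#xA;" yields the newline character itself, which is the distinction the patch restores.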

-- 
/Henrik Nord Erlang/OTP
