[erlang-questions] Bugs in xmerl_sax_parser?

Lars Thorsen lars@REDACTED
Mon Sep 28 15:18:58 CEST 2009


Hi Tony,
Thank you for reporting these errors.

1. xmerl_sax_parser:stream/2 failed with {fatal_error,_, "Continuation 
function undefined, and more data needed",_,_} when no continuation 
function was defined, even though the input was a complete document.

2. The namespace URI supplied on unprefixed attributes in startElement 
tuples was the same as the URI for the default namespace. According to 
the standard, the namespace name for an unprefixed attribute should 
always have no value.

I've fixed the errors and a patch can be fetched at
"http://www.erlang.org/download/patches/".

The files are named:
otp_src_R13B02_OTP-8213_OTP-8214.patch
otp_src_R13B02_OTP-8213_OTP-8214.readme
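
For anyone who cannot apply the patch right away, the workaround Tony
mentions for the second problem (ignoring the namespace name when the
prefix is empty) could be sketched roughly as below. The module name
attr_ns and the function normalize_event/1 are made up for illustration
and are not part of xmerl; the attribute tuple layout
{Uri, Prefix, AttributeName, Value} is taken from the output quoted
further down.

```erlang
-module(attr_ns).
-export([normalize_event/1]).

%% Hypothetical helper, not part of xmerl: rewrite a startElement event so
%% that every unprefixed attribute carries an empty namespace URI.
normalize_event({startElement, Uri, LocalName, QName, Attributes}) ->
    {startElement, Uri, LocalName, QName,
     [normalize_attr(A) || A <- Attributes]};
%% All other events are passed through unchanged.
normalize_event(Event) ->
    Event.

%% An empty prefix ([] is the same term as "") means the attribute is
%% unprefixed, so its namespace URI is cleared.
normalize_attr({_Uri, [], AttributeName, Value}) ->
    {[], [], AttributeName, Value};
%% Prefixed attributes keep the URI bound to their prefix.
normalize_attr(Attribute) ->
    Attribute.
```

It would be called from the event function before the event is used, for
example: fun (E,_,_) -> io:format("~p~n", [attr_ns:normalize_event(E)]), ok end.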

Regards, Lars Thorsen
OTP Development

Tony Garnock-Jones wrote:
> Hi all,
> 
> I've just been playing with xmerl_sax_parser and I think I've found two
> bugs:
> 
>  1. xmerl_sax_parser:file/2 and :stream/2 behave differently. Create a
>     file, "t.xml", containing
> 
> <elem attr='123' x:attr='234' xmlns='http://lshift.net/d'
> xmlns:x='http://lshift.net/x' />
> 
>     and then run the two expressions
> 
>     xmerl_sax_parser:file("t.xml", [{event_fun, fun (E,_,_) ->
> io:format("~p~n", [E]), ok end}]).
>     xmerl_sax_parser:stream("<elem attr='123' x:attr='234'
> xmlns='http://lshift.net/d' xmlns:x='http://lshift.net/x' />",
> [{event_fun, fun (E,_,_) -> io:format("~p~n", [E]), ok end}]).
> 
>     Note that they print the same event sequences, but the first results
>     in {ok,ok,<<>>}, while the second results in {fatal_error,...}!
> 
>  2. The namespace URI supplied on unprefixed attributes in startElement
>     tuples is the same as the URI for the default namespace. According
>     to http://www.w3.org/TR/xml-names/#defaulting, "The namespace name
>     for an unprefixed attribute name always has no value."
> 
>     For example, running one of the expressions above causes the
>     following to be printed:
> 
>     {startElement,"http://lshift.net/d","elem",
>                   {[],"elem"},
>                   [{"http://lshift.net/d",[],"attr","123"},
>                    {"http://lshift.net/x","x","attr","234"}]}
> 
>     While applications can work around this by ignoring the attribute's
>     namespace name in cases where the prefix == "", I wonder if it
>     wouldn't be better to supply "" for the namespace name for
>     unprefixed attributes, like this:
> 
>     {startElement,"http://lshift.net/d","elem",
>                   {[],"elem"},
>                   [{[],[],"attr","123"},
>                    {"http://lshift.net/x","x","attr","234"}]}
> 
> Regards,
>   Tony


