[erlang-questions] xmerl_scan for XMPP -- too much lookahead?

Ulf Wiger ulf.wiger@REDACTED
Wed Sep 23 16:27:31 CEST 2009


Tony,

The support for streaming in xmerl_scan is quite hackish,
and is bound to be broken one way or another. I doubt that
it is feasible to correctly handle streams given how
xmerl_scan works.

As it is, you could reasonably ask it to be a little
more discerning: right now, xmerl_eventp only breaks
at whitespace, which is very conservative. The main
problem is that xmerl_scan is undisciplined about
ensuring that it has enough characters to pattern-match
against the current function head.

E.g.

%% [75] ExternalID ::= 'SYSTEM' S SystemLiteral
%%                   | 'PUBLIC' S PubidLiteral S SystemLiteral
scan_doctype1([], S=#xmerl_scanner{continuation_fun = F}) ->
     F(fun(MoreBytes, S1) -> scan_doctype1(MoreBytes, S1) end,
       fun(S1) -> ?fatal(unexpected_end, S1) end,
       S);
scan_doctype1("PUBLIC" ++ T, S0) ->
     ...

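If given a stream fragment like "PUBL", neither clause
above matches: the input is not empty, so the continuation
function is never asked for more bytes, and it is not yet
"PUBLIC" ++ T either. xmerl_scan then derails. THIS is
a serious bug - and I'm originally at fault. ;-)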
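
To illustrate what a more disciplined check might look like, here
is a minimal standalone sketch. It is not xmerl_scan code: the module
name prefix_sketch and the function classify/1 are hypothetical. It
only shows how to tell "not enough input yet" apart from "no match"
for the two ExternalID keywords in production [75].

```erlang
-module(prefix_sketch).
-export([classify/1]).

%% classify(Input) -> {match, Keyword, Rest} | need_more | no_match
%%
%% Hypothetical helper: decides whether Input starts with one of the
%% ExternalID keywords, could still become one if more characters
%% arrive, or can never match.
classify(Input) ->
    classify(Input, ["PUBLIC", "SYSTEM"]).

classify(_Input, []) ->
    no_match;
classify(Input, [Kw | Kws]) ->
    case {lists:prefix(Kw, Input), lists:prefix(Input, Kw)} of
        {true, _} ->
            %% The whole keyword is present; return what follows it.
            {match, Kw, lists:nthtail(length(Kw), Input)};
        {false, true} ->
            %% Input is a proper prefix of the keyword (this includes
            %% the empty string), so more bytes are needed.
            need_more;
        {false, false} ->
            classify(Input, Kws)
    end.
```

In a scanner, the need_more result is where the continuation function
would be called, and no_match is where a fatal error would be raised.
With this sketch, classify("PUBL") gives need_more instead of failing
to match any clause.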

Have you tried using xmerl_sax_parser instead?
The plan is to replace xmerl_scan completely for stream
parsing. And given this, xmerl_eventp is unlikely to see
any major improvements.
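
For reference, a minimal sketch of driving xmerl_sax_parser with
chunked input, using the continuation_fun and event_fun options of
xmerl_sax_parser:stream/2. The module name sax_sketch, the function
parse_chunks/1 and the idea of keeping the pending chunks in a list
are assumptions made for illustration; in a real XMPP client the
continuation fun would read from the socket instead. The sketch
makes no claim about exactly when each event is emitted relative to
the arrival of input.

```erlang
-module(sax_sketch).
-export([parse_chunks/1]).

%% Parse an XML document delivered as a list of binary chunks.
%% The first chunk is handed to the parser directly; the remaining
%% chunks are supplied on demand by the continuation fun.
parse_chunks([First | Rest]) ->
    %% Event fun: prepend each SAX event to the accumulator.
    EventFun = fun(Event, _Location, Acc) -> [Event | Acc] end,
    %% Continuation fun: hand over the next chunk, or an empty
    %% binary once the (hypothetical) chunk list is exhausted.
    ContFun = fun([Chunk | More]) -> {Chunk, More};
                 ([]) -> {<<>>, []}
              end,
    Result = xmerl_sax_parser:stream(First,
                 [{event_fun, EventFun},
                  {event_state, []},
                  {continuation_fun, ContFun},
                  {continuation_state, Rest}]),
    case Result of
        {ok, Events, Trailing} ->
            %% Events were accumulated in reverse order.
            {ok, lists:reverse(Events), Trailing};
        Error ->
            Error
    end.
```

For example, sax_sketch:parse_chunks([<<"<opentag>te">>,
<<"xt</opentag>">>]) splits the document in the middle of the text
node and lets the parser ask for the second chunk itself.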

BR,
Ulf W



Tony Garnock-Jones wrote:
> Hi all,
> 
> I'm trying to use xmerl_scan to handle XMPP stanzas, and I think I've
> found a problem with it. I don't think it's intended for streaming XML
> in XMPP-like interleaved-request-and-response-document situations.
> 
> The attached program feeds the string "<opentag>" to xmerl_scan. I would
> have expected to receive the open-tag event before blocking for more
> data, but instead it requires at least one more character of input data
> before it will emit the open-tag event! If I instead pass "<opentag> ",
> with a space after the close-bracket, it emits the open-tag event as
> expected, plus the start of an xmlText, before blocking for more data.
> 
> My question, then, is:
> 
>   Is this a bug? Should xmerl_scan supply the open-tag event before
>   blocking, when it is fed "<opentag>"?
> 
> (The specific context of this problem is dealing with the
> <stream:stream> sent by the server at the handshake stage of XEP-114.)
> 
> Regards,
>   Tony
> 
> P.S.: to run the attached program,
>   $ erlc lookaheadbug.erl && erl -run lookaheadbug go
> 
> 


-- 
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com

