[erlang-questions] beginner: xmerl:export_simple/2 without changing > into &gt;

Richard A. O'Keefe ok@REDACTED
Thu Feb 13 23:51:00 CET 2014


On 14/02/2014, at 1:20 AM, Anthony Ramine wrote:

> No. That would be invalid XML, not even well-formed, even.
> 
> Why would an XML implementation allow you to generate ill-formed XML?

Let's see.
XML fifth edition.
Rule 10: an attribute value may not contain a less than
sign, but there is no problem with greater than signs.
Rule 14: character data may not contain a less than sign
and it may not contain the sequence ']]>', but there is
no restriction on greater than signs that are NOT preceded
by two right brackets.
Rule 20: a CDATA section may not contain the sequence ']]>'
and there is no way to escape any of those characters, so
even _after_ parsing, a CDATA section may not contain ']]>'.
But there is no other restriction on greater than signs.

In particular,
	<p>"Some text with >"</p>
and	<![CDATA["Some text with >"]]>
would appear to be perfectly legal.
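
A quick way to check this is to feed both forms to a conforming
parser. The following is a minimal sketch, assuming Python's standard
xml.etree.ElementTree (expat-based) rather than xmerl; the CDATA
example is wrapped in a <p> element here so that it forms a complete
document, which is an assumption added for illustration.

```python
import xml.etree.ElementTree as ET

# Both documents carry a literal '>' in their character data.
plain = '<p>"Some text with >"</p>'
cdata = '<p><![CDATA["Some text with >"]]></p>'

# ET.fromstring raises ET.ParseError if the input is not well-formed,
# so reaching the print calls means both inputs were accepted.
plain_text = ET.fromstring(plain).text
cdata_text = ET.fromstring(cdata).text

print(plain_text)
print(cdata_text)
```

Both calls return the same string, with the '>' intact.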

However,
- the general difficulty of ensuring that ]]> does not occur
  and the absence of a predefined name for ']' makes it
  a good idea to always replace > by &gt;
- any XML tool chain that makes a program *have* to be aware
  of a distinction between
	<p>"Some text with >"</p>
  and	<p>"Some text with &gt;"</p>
  is one that I would have to be paid millions of dollars to use.
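
The second point can be illustrated with a small sketch, again
assuming Python's xml.etree.ElementTree as a stand-in for whatever
tool chain is in use: the literal character, the predefined entity,
and a numeric character reference all parse to the same text, and the
serializer is free to pick whichever spelling it likes on output.

```python
import xml.etree.ElementTree as ET

literal = ET.fromstring('<p>"Some text with >"</p>')
escaped = ET.fromstring('<p>"Some text with &gt;"</p>')
numeric = ET.fromstring('<p>"Some text with &#62;"</p>')

# After parsing, the three spellings are indistinguishable.
same_text = literal.text == escaped.text == numeric.text

# ElementTree happens to write '>' in text content as '&gt;'.
serialized = ET.tostring(literal, encoding="unicode")

# Re-parsing the serialized form gives back the original text.
round_trip = ET.fromstring(serialized).text

print(same_text)
print(serialized)
```

A program that only looks at the parsed text never sees the
difference, which is why the choice of output spelling should not
matter to it.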

So in short, how could anyone possibly "need" to keep '>'
as '>' rather than '&gt;' or '&#62;'?

Not many people are aware of the illegality of ]]> in plain
text.  None of the XML parsers I use is aware of it.  But it
has always been illegal according to the XML specifications.
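
Always escaping '>' sidesteps the problem entirely. As a minimal
sketch, assuming Python's xml.sax.saxutils (the sample string is made
up for illustration), escaping every '>' means ']]>' can never appear
in the generated character data:

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical text that happens to contain the forbidden sequence.
text = "a[b[c]]> d"

# escape() replaces '&', '<' and '>' with their predefined entities.
escaped_text = escape(text)

# unescape() reverses the substitution, so no information is lost.
restored = unescape(escaped_text)

print(escaped_text)
```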



