[erlang-questions] beginner: Generating HTML with ">" from Erlang
Bengt Kleberg
bengt.kleberg@REDACTED
Fri Feb 14 13:55:18 CET 2014
Greetings,
Sorry about the delay.
Have I misunderstood that #xmlText{} would help to preserve "<" when
doing xmerl:export_simple/2?
I tried this input:
{script, [], [#xmlText{value=Script}]}
where Script contains some
if (i < 0) { ...
It ends up in the HTML as
if (i &lt; 0) { ...
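
A possible workaround, as a minimal sketch only: write the <script> element out by hand as an iolist, using the CDATA idiom from Richard's mail quoted below, and keep it away from the text export. The module name script_tag and the function script_element/1 are made up for illustration and are not part of xmerl. The sketch assumes the script is a flat string.

```erlang
-module(script_tag).
-export([script_element/1]).

%% Hypothetical helper, not part of xmerl.
%% Takes the Javascript source as a flat string and returns an iolist
%% holding a complete <script> element with the source written verbatim.
script_element(Script) when is_list(Script) ->
    %% "]]>" would end the CDATA section early for an XML parser and
    %% "</script" would end the element early for an HTML parser, so
    %% refuse such scripts instead of guessing a fix.
    %% Assumption: only the lower-case spelling "</script" is checked.
    case contains(Script, "]]>") orelse contains(Script, "</script") of
        true ->
            erlang:error(badarg, [Script]);
        false ->
            ["<script type=\"text/javascript\">\n",
             "// <![CDATA[\n",
             Script, "\n",
             "// ]]>\n",
             "</script>\n"]
    end.

%% True if the flat string Sub occurs somewhere in the flat string Str.
contains([], Sub) ->
    Sub =:= [];
contains([_ | T] = Str, Sub) ->
    lists:prefix(Sub, Str) orelse contains(T, Sub).
```

For example, io:put_chars(script_tag:script_element("if (i < 0) { i = 0; }")) prints the element with the "<" untouched. This sidesteps xmerl for this one element rather than fixing the export itself.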
bengt
On Fri, 2014-02-14 at 11:42 +0100, Richard Carlsson wrote:
> Thank you. So, to summarize, the interpretation of what's between the
> <script> and </script> tags depends on whether you parse the text as XML
> or HTML. Some more googling shows that the following is a common idiom
> to allow the text to be parsed in both ways:
>
> <script type="text/javascript">
> // <![CDATA[
> ... if (i < 0) { ...
> // ]]>
> </script>
>
> The // keeps Javascript from seeing the CDATA start/end markers, HTML
> ignores them, and XML removes them and passes on everything in between
> as it is.
>
> I think the #xmlText{} record in xmerl (see xmerl.hrl) can be used here
> to wrap the contents of the <script> element so it gets written verbatim
> when you export the XML data.
>
> /Richard Carlsson
>
> On 2014-02-14 03:02 , Richard A. O'Keefe wrote:
> >
> > On 14/02/2014, at 3:31 AM, Richard Carlsson wrote:
> >> Out of curiosity, if it had been < instead, which of the following would work?
> >>
> >> if (i &lt; 0) {
> >
> > That should work in XHTML but not HTML.
> >>
> >> if (i < 0) {
> >
> > That should work in HTML but not XHTML.
> >
> > XHTML is an application of XML. It declares
> > <!ELEMENT script (#PCDATA)>
> > and we have
> > [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
> >
> > That is, a chunk of character data is any run of characters
> > not containing '<' or '&' or ']]>'.
> >
> > The ampersand character (&) and the left angle
> > bracket (<) MUST NOT appear in their literal form,
> > except when used as markup delimiters, or within
> > a comment, a processing instruction, or a CDATA
> > section. If they are needed elsewhere, they
> > MUST be escaped using either numeric character
> > references or the strings "&amp;" and "&lt;"
> > respectively. The right angle bracket (>) may be
> > represented using the string "&gt;", and MUST, for
> > compatibility, be escaped using either "&gt;" or a
> > character reference when it appears in the string
> > "]]>" in content, when that string is not marking
> > the end of a CDATA section.
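
The rule quoted above can be sketched in Erlang roughly as follows. The module name chardata and the function escape/1 are made up for illustration and are not xmerl functions. The sketch assumes a flat string as input and only escapes what the rule requires.

```erlang
-module(chardata).
-export([escape/1]).

%% Hypothetical helper illustrating the quoted rule; not part of xmerl.
%% Escapes a flat string so it can appear as character data in XML.
escape([]) ->
    [];
%% '&' and '<' must never appear literally in character data.
escape([$& | T]) ->
    "&amp;" ++ escape(T);
escape([$< | T]) ->
    "&lt;" ++ escape(T);
%% '>' only has to be escaped when it would complete "]]>".
escape([$], $], $> | T]) ->
    "]]&gt;" ++ escape(T);
%% Everything else, including a lone '>', is passed through.
escape([C | T]) ->
    [C | escape(T)].
```

With this, chardata:escape("if (i < 0 && j > 1) {") gives "if (i &lt; 0 &amp;&amp; j > 1) {", and chardata:escape("a]]>b") gives "a]]&gt;b".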
> >
> > #PCDATA may also contain entity references (&lt;),
> > character references (&#60;), comments,
> >
> >>
> >> If it is the first case, there is presumably a very specific rule for this,
> >
> > The legality of "i &lt; 0" in XHTML falls out of general rules
> > and the content model of the <script> element.
> >
> > As far as HTML is concerned, it's not illegal, but HTML
> > will pass the '<' on verbatim to Javascript, which doesn't
> > like it.
> >
> >> If it's the second case, how is the script text really supposed to be handled by XML tools? As CDATA (then, how is it delimited?)
> >
> > XML has <![CDATA[...]]> *marked sections*, but it
> > does NOT have CDATA *content models*.
> >
> >> or as normal XML text (and then how can the < be accepted by the parser,
> >
> > In HTML, a "<" character followed by white space is perfectly legal;
> > in XML, it is not.
> >
> >> and why wasn't &gt; converted to > before the Javascript parser got hold of the text)?
> >
> > Possibly because the web browser got it wrong.
> >
> > CDATA and RCDATA content models in SGML were broken by design.
> > Such an element beginning with <foo> should only have been
> > terminated by </foo>, but they're terminated by *any* '</'
> > followed by any of > ( or letter.
> > It had already been explained extremely clearly *before* the
> > <SCRIPT> element was added to HTML that the content model
> > should have been (#PCDATA) using <![CDATA[ sections for quoting.
> >
> >
>