DTD / xmerl_scan:file help !!!!! {invalid_nmtoken,"*) #RE"

Richard A. O'Keefe ok@REDACTED
Thu Jul 14 00:24:19 CEST 2005


"Sanjaya Vitharana" <sanjaya@REDACTED> wrote:
	<!ATTLIST key
	      no (1|2|3|4|5|6|7|8|9|z|r|c|*) #REQUIRED>
	    ---Note the * (astric in the end)
	
The asterisk (NOT "astric") is *NOT ALLOWED* in an enumerated attribute.
Here are the relevant rules from the XML specification (one does expect
people programming with XML to read the XML specification...):

   [59] Enumeration ::= '(' S? Nmtoken (S? '|' S? Nmtoken)* S? ')'
   [7]  Nmtoken     ::= (NameChar)+
   [4]  NameChar    ::= Letter | Digit | '.' | '-' | '_' | ':'
                     |  CombiningChar | Extender

Roughly speaking, an Nmtoken is any non-empty sequence of characters
that may occur in an identifier, not necessarily starting with a letter.
So 1 is an Nmtoken and c is an Nmtoken but * is *NOT* an Nmtoken.
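
To make the rule concrete, here is a minimal sketch of an Nmtoken test in
Erlang.  It is an ASCII-only approximation of productions [7] and [4]: the
real NameChar also admits non-ASCII letters and digits, combining characters
and extenders.  The module and function names are made up for illustration;
nothing here is part of xmerl.

```erlang
-module(nmtoken_check).
-export([is_nmtoken/1]).

%% True if Str is a non-empty string consisting only of name characters.
%% ASCII-only approximation: non-ASCII Letter, Digit, CombiningChar and
%% Extender characters are (wrongly, but simply) rejected.
is_nmtoken([]) -> false;
is_nmtoken(Str) when is_list(Str) -> lists:all(fun is_namechar/1, Str).

%% One clause per alternative of production [4].
is_namechar(C) when C >= $a, C =< $z -> true;
is_namechar(C) when C >= $A, C =< $Z -> true;
is_namechar(C) when C >= $0, C =< $9 -> true;
is_namechar(C) -> lists:member(C, ".-_:").
```

Under that approximation, is_nmtoken("1") and is_nmtoken("c") are true,
while is_nmtoken("*") and is_nmtoken("") are false.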

	Error I got is.
	
	1> {ParsResult2,Misc2}=xmerl_scan:file('ivr.xml',[{validation,true}]).
	2798- fatal: {invalid_nmtoken,"*) #RE"}
	** exited: {fatal,{{invalid_nmtoken,"*) #RE"},
	                   {file,"./ivr.dtd"},
	                   {line,21},
	                   {col,47}}} **
	
	Seems to be * (astric) is invalid in DTD (or may be xmerl_scan
	validation problem).
	
Any validating XML parser *MUST* report this error somehow.
This particular error message is, for a wonder, just right.
It tells you something true ("*" is not an Nmtoken) and relevant (by
implication, an Nmtoken is required there) and gives you the right location.

	But i want to use * (astric) in my XML file as below.  So there
	must be a way to define it in DTD ..... can any DTD expert help
	me in this case.

My first suggestion is TRY WANTING SOMETHING ELSE.
Why not try wanting '-' or '.' instead?  They *are* Nmtokens.

My second suggestion, if there is someone pointing a gun at your head
saying "use the un-XML-like asterisk here or DIE", is to give up the
two things that enumerated attributes provide:
 - a validating parser checks that the value of an enumerated attribute
   is in fact one of the listed values
 - a validating parser removes white space around the Nmtoken in such an
   attribute value, so that 'r', 'r ', ' r ' all give the same 'r' value.
If you are willing to give those up, then all you have to write is

	<!ATTLIST key no CDATA #REQUIRED>

and any old string will be accepted by the XML parser.  This means that
it is then your problem to check the values given a parsed document,
rather than the parser's problem.
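
One way that check might look, as a sketch only.  It assumes the values live
in the 'no' attribute of 'key' elements anywhere in the document, that the
allowed set is the original enumeration plus the asterisk, and that
string:trim/1 is available (OTP 20 or later).  The module and function names
are hypothetical.

```erlang
-module(key_check).
-export([bad_keys/1, bad_values/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Assumed allowed set: the values from the original enumeration plus "*".
allowed() -> ["1","2","3","4","5","6","7","8","9","z","r","c","*"].

%% Return the values that are not allowed.  Each value is trimmed first,
%% which restores the white-space handling an enumerated type would give.
bad_values(Values) ->
    [V || V <- Values, not lists:member(string:trim(V), allowed())].

%% Parse File, validating against its DTD, and check every key/@no value.
%% Returns the list of offending attribute values ([] if all are fine).
bad_keys(File) ->
    {Doc, _Misc} = xmerl_scan:file(File, [{validation, true}]),
    Attrs = xmerl_xpath:string("//key/@no", Doc),
    bad_values([V || #xmlAttribute{value = V} <- Attrs]).
```

With the CDATA declaration in the DTD, key_check:bad_keys('ivr.xml') should
return [] when every value is acceptable, and the offending strings otherwise.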



