Language change proposal

Joachim Durchholz joachim.durchholz@REDACTED
Tue Nov 4 01:35:08 CET 2003


Michael Hobbs wrote:
> That line seems to imply that if an entity contains an encoding
> declaration, then the whole entity must be encoded with that encoding.
> This presents a chicken-or-egg problem in that how is an XML processor to
> process an encoding declaration before it knows what the encoding is?

An entity that carries an encoding declaration always starts with a known
character ("<" for XML, as the first character of "<?xml").
Assuming the entity is well-formed, the XML processor can infer a first
estimate of the encoding from those leading bytes, and then check that
estimate against the encoding declaration once it has parsed it.
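
To illustrate the "first estimate" step, here is a minimal Python sketch. It
assumes the entity starts with either a byte order mark or the text "<?xml",
and it only distinguishes encoding families. The function name and the
returned labels are made up for this example, and UTF-32 is left out.

```python
def guess_xml_family(head: bytes) -> str:
    """Return a first estimate of the encoding family from the leading bytes."""
    # Byte order marks come first: they identify the encoding directly.
    if head.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if head.startswith(b"\xfe\xff"):
        return "utf-16-be"
    if head.startswith(b"\xff\xfe"):
        return "utf-16-le"
    # No byte order mark: look at how "<?" or "<?xm" is laid out in bytes.
    if head.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if head.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    if head.startswith(b"<?xm"):
        # ASCII, Latin-1, UTF-8, ...: the declaration says which one.
        return "ascii-compatible"
    if head.startswith(b"\x4c\x6f\xa7\x94"):
        # "<?xm" in EBCDIC; the declaration says which code page.
        return "ebcdic"
    return "unknown"
```

The estimate is only good enough to read the declaration itself; the
declaration then names the exact encoding within that family.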

> So, to bring the wagons back around to Erlang, if there ever is an
> -erlang(Encoding, Version) declaration, it would be nice if it is clearly
> stated what encoding should be used for the "-erlang(Encoding, Version)"
> text.

The declaration should use the same encoding as the rest of the source file.
Proceed as follows:

IF first two bytes are hex FE FF or FF FE (a byte order mark) THEN
   assume Unicode
ELSEIF first character is EBCDIC encoding of "-" THEN
   assume the intersection of all EBCDIC code pages
   parse first line
   IF it's not something like "-erlang(EBCDIC-whatever, Version)" THEN
     report error, abort compilation
   ENDIF
   load requested EBCDIC code page
ELSEIF first character is ASCII encoding of "-" THEN
   assume the intersection of all ASCII code pages (Latin-1, ...)
   parse first line
   IF it's not something like "-erlang(Encoding, Version)"
      (where "Encoding" is one of the supported ASCII code pages)
   THEN
     report error, abort compilation
   ENDIF
   load requested ASCII code page
ELSE
   assume 7-bit ASCII
ENDIF
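
For what it's worth, the procedure above could be sketched in Python roughly
as follows. This is only an illustration under several assumptions: the
"-erlang(Encoding, Version)" syntax is the hypothetical one from this thread,
the lists of supported code pages are invented, cp037 stands in for "the
intersection of all EBCDIC code pages", and encoding names are taken to be
Python codec names.

```python
import re

# Hypothetical declaration syntax, as discussed in this thread.
DECL = re.compile(r"-erlang\(\s*([A-Za-z0-9_-]+)\s*,\s*([0-9.]+)\s*\)")

# Invented lists of supported code pages (Python codec names).
ASCII_PAGES = {"latin-1", "iso8859-15", "cp1252"}
EBCDIC_PAGES = {"cp037", "cp500"}

ASCII_MINUS = b"\x2d"   # "-" in ASCII and its supersets
EBCDIC_MINUS = b"\x60"  # "-" in the EBCDIC code pages listed above


def _from_declaration(data: bytes, bootstrap: str, supported: set) -> str:
    # Decode just the start of the file with a bootstrap codec that agrees
    # with every code page of the family on the declaration's characters.
    head = data[:128].decode(bootstrap, errors="replace")
    match = DECL.match(head)
    if match is None:
        raise ValueError("file must start with -erlang(Encoding, Version)")
    encoding = match.group(1).lower()
    if encoding not in supported:
        raise ValueError("unsupported encoding: " + encoding)
    return encoding


def detect_source_encoding(data: bytes) -> str:
    if data[:2] in (b"\xfe\xff", b"\xff\xfe"):
        return "utf-16"  # byte order mark: assume Unicode
    if data[:1] == EBCDIC_MINUS:
        return _from_declaration(data, "cp037", EBCDIC_PAGES)
    if data[:1] == ASCII_MINUS:
        return _from_declaration(data, "ascii", ASCII_PAGES)
    return "ascii"  # no declaration: assume 7-bit ASCII
```

With this sketch, a file starting with "-erlang(latin-1, 1.0)." yields
"latin-1", while a file starting with "-module(x)." is rejected, since the
leading "-" promises a declaration that is not there.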

Just my 2c.

Regards,
Jo



