[erlang-questions] UTF8 and EDoc
Tue Oct 6 06:30:32 CEST 2009
2009/10/5 Tomas Abrahamsson <tomas.abrahamsson@REDACTED>:
> An option could be to adopt the way it is done in Python:
> it (re)uses the editor's encoding declaration. If it finds the text
> -*- coding: utf-8 -*- or vim: set fileencoding=utf-8 :
> on the first or second line of the source file, then it sets
> the encoding for the entire source file accordingly. (It also
> understands unicode byte-order marks at the beginning
> of the file, which apparently makes life easier in editors
> on Windows.)
Yuk! Not every editor has this information, surely?
If a text file needs to tell an application its encoding, then either
a) declare the encoding in the file itself
(as XML does: <?xml version="1.0" encoding="utf-8"?>), or
b) be explicit about it when invoking the application.
I also think a default encoding as a fallback is essential,
utf-8 being the obvious one.
The BOM (byte order mark) as the first character of a file
has not been successful.
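
To make that order of precedence concrete, here is a rough sketch in Python
(since that is the model under discussion). The function name
resolve_encoding and the choice to accept both "#" and "%" as comment
characters are my own assumptions for illustration; the pattern is only
modelled on the one in PEP 263, and nothing here is an existing Erlang or
EDoc API. It deliberately ignores the BOM.

```python
import re
from typing import Optional

# Modelled on the PEP 263 pattern: a comment on one of the first two lines
# containing "coding:" or "coding=" followed by an encoding name.
# Accepting "%" (Erlang comments) as well as "#" is an assumption.
_DECLARATION = re.compile(rb"^[ \t\f]*[#%].*?coding[:=][ \t]*([-\w.]+)")


def resolve_encoding(data: bytes,
                     explicit: Optional[str] = None,
                     default: str = "utf-8") -> str:
    """Hypothetical helper: choose an encoding for raw source bytes."""
    # (b) The caller was explicit, so that wins.
    if explicit:
        return explicit
    # (a) The file declares its own encoding on line one or two.
    for line in data.splitlines()[:2]:
        match = _DECLARATION.match(line)
        if match:
            return match.group(1).decode("ascii")
    # Neither: fall back to the default.
    return default
```

With this, resolve_encoding(b"%% -*- coding: latin-1 -*-\n") gives "latin-1",
while a file with no declaration and no explicit argument comes out as utf-8.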
> See http://www.python.org/peps/pep-0263.html for details.
> An advantage with this scheme seems to be that it fits nicely
> with editors. They already know how to handle this.
Only if you use the 'right' editor, surely?
> It would probably require the Erlang compiler, edoc, and other tools
> to be modified to know about source file encodings, though.
What of programmatically generated files?
> I suppose that with the \u-escaping, existing tools would continue
> to work without modification, but it would be more work for the
> programmer to type the text in as \u-sequences, unless editors
> already know how to do such a transformation on the fly?
Or mimic Python even more?
u"A utf-8 encoded string"
and unicode('another unicode string'),
that is, a string-literal prefix and a conversion function.
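
As for the \u-escaping quoted above, the transformation is mechanical
enough that a tool could do it instead of the programmer. A minimal sketch
in Python follows; the name to_u_escapes is made up for illustration, and
the \UXXXXXXXX form for characters beyond U+FFFF is borrowed from Python's
own convention, not from anything Erlang defines.

```python
def to_u_escapes(text: str) -> str:
    """Hypothetical helper: replace every non-ASCII character by an escape."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                  # plain ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)      # four hex digits
        else:
            out.append("\\U%08X" % cp)      # eight hex digits (assumed form)
    return "".join(out)
```

For example, to_u_escapes("Å") returns the six ASCII characters \u00C5.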
> If no such encoding declaration is found, Python assumes ASCII,
> but Erlang could maybe assume Latin-1.
Please move on to utf-8. Latin-1 is so restrictive.
XSLT XSL-FO FAQ.