[erlang-questions] unicode in string literals

Tue Jul 31 09:33:26 CEST 2012

On 31/07/2012, at 7:05 PM, Joe Armstrong wrote:

> Is "encoding(...)"  a good idea?
> 
> There are four reasonable alternatives
> 
>    a) - all files are Latin1

No good for people who need to write (comments, strings, quoted
atoms) in a language not limited to a Western European script.

>    b) - all files are UTF8

No good for people who are perfectly happy with Latin-1 (me!)
and who need the occasional character outside ASCII (like, oh,
some people in Sweden maybe?)  But could be tolerable.

>    c) - all files are Latin1 or UTF8 and you guess

Guessing is always a bad idea.
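
To illustrate why guessing fails, here is a minimal Python sketch.
The function name guess is hypothetical and stands for any
"try UTF-8 first, else Latin-1" heuristic; it is not taken from any
real tool.  The point is that some byte sequences are valid in both
encodings but denote different text, so no guesser can tell which
one the author meant.

```python
# The same two bytes are valid in both encodings but mean different text.
raw = b"\xc3\xa5"

as_utf8 = raw.decode("utf-8")      # one character: U+00E5
as_latin1 = raw.decode("latin-1")  # two characters: U+00C3 U+00A5


def guess(data):
    """Hypothetical guesser: try UTF-8 first, fall back to Latin-1."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("latin-1")
```

With this heuristic, a Latin-1 file that really does contain the two
characters U+00C3 U+00A5 is silently read as a single U+00E5, and
nothing reports the mistake.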

>    d) - all files are Latin1 or UTF8 or anything else and you tell

It works for XML.  :- encoding(...) works for SWI Prolog:

	:- encoding(+Encoding)
	This directive can appear anywhere in a source file
	to define how characters are encoded in the remainder
	of the file.  It can be used in files that are encoded
	with a superset of ASCII, currently UTF-8 and Latin-1.
	See also section 2.18.1.

A smart editor like Emacs can be taught to recognise
[:]- ?encoding([']Encoding[']).
at the top of a file just as easily as it can recognise its
own mode-lines.
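
As a rough sketch of what "you tell" could look like to a tool, the
Python below scans the first few lines of a file for such a directive
and decodes the file accordingly.  Everything named here is an
assumption made for illustration: the regular expression is modelled
on the pattern above, and the five-line limit, the Latin-1 default,
the table of accepted names, and the function names are hypothetical,
not part of Erlang or SWI Prolog.

```python
import re

# Hypothetical directive pattern, modelled on the one above: an optional
# ":" before "-", an optional space, and optional single quotes.
DIRECTIVE = re.compile(rb":?- ?encoding\('?([A-Za-z0-9_-]+)'?\)\.")

# Assumed mapping from directive names to Python codec names.
CODECS = {
    "latin1": "latin-1",
    "latin-1": "latin-1",
    "utf8": "utf-8",
    "utf-8": "utf-8",
}


def declared_encoding(raw, default="latin-1", max_lines=5):
    """Return the codec named in the first max_lines lines, else default."""
    for line in raw.split(b"\n")[:max_lines]:
        match = DIRECTIVE.search(line)
        if match:
            name = match.group(1).decode("ascii").lower()
            if name not in CODECS:
                raise ValueError("unsupported encoding: " + name)
            return CODECS[name]
    return default


def decode_source(raw):
    """Decode the bytes of a source file using its declared encoding."""
    return raw.decode(declared_encoding(raw))
```

The directive itself is plain ASCII, so it can be found before the
encoding is known, which is the same property that lets the XML
declaration work.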