[erlang-questions] unicode in string literals
Wed Aug 1 09:30:19 CEST 2012
First, thanks for the detailed explanation. I see I am still confusing
some of the issues.
On Wed, Aug 1, 2012 at 3:56 AM, Richard O'Keefe wrote:
> On 31/07/2012, at 7:36 PM, Vlad Dumitrescu wrote:
> It's not clear to me what you mean by a 'project',
I mean a set of related code, some of it possibly third-party.
> but why should a module written by someone who wants
> comments in Māori (note the macron? Latin-4 or Unicode needed)
> use a module written by someone who wants comments in Swedish?
Maybe not in the long run, but there will be a (long) transition
period where legacy code will still be used by new code.
> The whole point of an -encoding directive is that it is something
> that syntaxtools should handle; by the time your code gets an AST
> or a token list, encodings are entirely a thing of the past.
Yes, but I am one of the guys that is going to write some of the tools
that will handle this conversion, so I do care about the details.
> SWI Prolog actually lets you change the encoding within a file,
> which sounds crazy but maybe Jan wanted the machinery to be there
> in case someone wanted ISO 2022 support. (Because that's basically
> what 2022 *is*: switching encoding aspects on the fly.)
Are there any editors that can load/save a file with mixed encodings like that?
> Converting between strings and binaries is the one place where Erlang
> source code should have any reason to care, and it does have a reason
> to care. But you will perceive that it is the *binary* that needs to
> be associated with an encoding, not the *string*.
> of the system
Right. Good explanation!
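
To make the string/binary point concrete, here is a minimal sketch using the standard `unicode` module. The module name `encoding_demo` and its function names are made up for illustration. A string is just a list of code points; an encoding only enters the picture when that list is turned into bytes or read back from them.

```erlang
-module(encoding_demo).
-export([to_utf8/1, to_latin1/1, from_utf8/1]).

%% A string is a list of Unicode code points and carries no encoding.
%% Example input: [77, 257, 111, 114, 105] is "Maori" with a macron
%% on the a (U+0101 = 257).

%% Encode the code points as UTF-8 bytes.
to_utf8(String) ->
    unicode:characters_to_binary(String, unicode, utf8).

%% Encode the code points as Latin-1 bytes. This returns an
%% {error, EncodedSoFar, Rest} tuple when a code point is above 255.
to_latin1(String) ->
    unicode:characters_to_binary(String, unicode, latin1).

%% Decode UTF-8 bytes back into a list of code points.
from_utf8(Bin) ->
    unicode:characters_to_list(Bin, utf8).
```

With that input, `to_utf8/1` gives `<<77,196,129,111,114,105>>`, while `to_latin1/1` gives an error tuple because code point 257 has no Latin-1 representation. The same list, two different outcomes, depending only on the encoding chosen for the binary.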
I am still a little worried about two things:
- debugging a remote system that has a different locale
- reading logs created by modules that have different encodings (some
modules might be legacy and not be aware that the world is not Latin-1).
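
For the log case, one possible approach (a heuristic sketch, not something the thread proposes) is to try decoding each line as UTF-8 and fall back to Latin-1 when the bytes are not valid UTF-8. The module name `log_decode` and the function `decode_line/1` are hypothetical.

```erlang
-module(log_decode).
-export([decode_line/1]).

%% Try UTF-8 first. unicode:characters_to_list/2 returns a plain list
%% on success and an {error, ...} or {incomplete, ...} tuple otherwise.
%% In the failure case, assume the line came from a Latin-1 module:
%% every Latin-1 byte value is also its Unicode code point.
decode_line(Bin) when is_binary(Bin) ->
    case unicode:characters_to_list(Bin, utf8) of
        Chars when is_list(Chars) -> {utf8, Chars};
        _Invalid -> {latin1, binary_to_list(Bin)}
    end.
```

This is only a guess at the writer's encoding: some Latin-1 byte sequences happen to be valid UTF-8 and would be misread, so it does not remove the worry, it only narrows it.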