[erlang-questions] unicode in string literals
Tue Jul 31 14:25:35 CEST 2012
Correct. My bad.
Still, a question remains: how does the compiler tell a list of integers
apart from a string encoded in UTF-8? For example, consider a list of
indexes versus a string containing special characters in UTF-8. If
lists:reverse/1 treats its argument as UTF-8, you get an undesired result
for the list of indexes; conversely, if it treats its argument as Latin-1
(one byte per character), you get an undesired result for your string. And
I don't suppose "-encoding()" would solve this problem either. By dividing
the problem into two types of list manipulation, one can easily decide
where to apply what.
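
To make the concern concrete, here is a minimal sketch, assuming a reasonably recent OTP with the stdlib unicode module. The module name utf8_reverse_demo is made up for illustration. All characters are written as integers so the result does not depend on the encoding of the source file. It contrasts reversing a list of code points with reversing the UTF-8 bytes of the same text:

```erlang
-module(utf8_reverse_demo).
-export([demo/0]).

%% Hypothetical demo module; the name is an assumption for illustration.
demo() ->
    %% "a" followed by e-acute as Unicode code points: 97 and 233 (16#E9).
    CodePoints = [97, 233],

    %% The same text encoded as UTF-8: e-acute becomes the bytes 195, 169.
    Utf8Bin = unicode:characters_to_binary(CodePoints, unicode, utf8),
    <<97, 195, 169>> = Utf8Bin,
    Utf8Bytes = binary_to_list(Utf8Bin),

    %% Reversing code points keeps every character intact.
    [233, 97] = lists:reverse(CodePoints),

    %% Reversing the raw bytes swaps the two bytes of e-acute ...
    Reversed = lists:reverse(Utf8Bytes),
    [169, 195, 97] = Reversed,

    %% ... so the result is no longer valid UTF-8.
    {error, _, _} =
        unicode:characters_to_list(list_to_binary(Reversed), utf8),

    %% To reverse UTF-8 text safely: decode, reverse, re-encode.
    Decoded = unicode:characters_to_list(Utf8Bin, utf8),
    <<195, 169, 97>> =
        unicode:characters_to_binary(lists:reverse(Decoded), unicode, utf8),
    ok.
```

Calling utf8_reverse_demo:demo() should return ok if every match holds. Note that lists:reverse/1 itself only ever sees integers; whether the result is "right" depends on whether those integers are code points or encoded bytes.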
On Tue, Jul 31, 2012 at 11:48 AM, Masklinn wrote:
> On 2012-07-31, at 11:36 , CGS wrote:
> > I might be wrong, but wouldn't switching to UTF-8 by default force the
> > compiler to use (at least) 2 bytes per character?
> No? The first 128 code points (ASCII) fit in a single byte.
> > If so, for example, what
> > about the databases based on Erlang for projects using strict Latin-1?
> The ASCII (7-bit) characters would be stored on 1 byte, those beyond
> that (up to, but not including, code point 2048) would be on 2 bytes.