[erlang-questions] unicode in string literals

CGS cgsmcmlxxv@REDACTED
Tue Jul 31 14:25:35 CEST 2012


Correct. My bad.

Still, a question remains: how does the compiler distinguish between a
list of integers and a string encoded in UTF-8? For example, consider a
list of indexes versus a string containing special characters in UTF-8.
If lists:reverse/1 treats the list as UTF-8, you get an undesired result
for the reversed list of indexes; conversely, if lists:reverse/1 treats
it as Latin-1, you get an undesired result for the reversed string. And
I don't suppose "-encoding()" would solve this problem either. By
dividing the problem into two types of list manipulation, one can easily
decide which to apply where.
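
A minimal sketch of the case I have in mind, assuming the same text is held
once as a list of Unicode code points and once as a list of raw UTF-8 bytes.
The module name utf8_reverse_demo and the sample data are made up for
illustration; only lists:reverse/1 and the stdlib unicode module are real:

```erlang
-module(utf8_reverse_demo).
-export([demo/0]).

%% "e with acute accent" is code point 233 (16#E9);
%% its UTF-8 encoding is the two bytes 195, 169.
demo() ->
    %% The string as a list of code points.
    CodePoints = [$a, 233, $b],

    %% The same string as a list of UTF-8 bytes: [97,195,169,98].
    Utf8Bytes = binary_to_list(unicode:characters_to_binary(CodePoints)),

    %% Reversing code points keeps every character intact: [98,233,97].
    RevPoints = lists:reverse(CodePoints),

    %% Reversing the raw bytes swaps the lead byte and the continuation
    %% byte of the two-byte sequence: [98,169,195,97].
    RevBytes = lists:reverse(Utf8Bytes),

    %% Decoding the reversed bytes as UTF-8 no longer succeeds;
    %% unicode:characters_to_list/2 returns an {error, _, _} tuple here.
    Decoded = unicode:characters_to_list(list_to_binary(RevBytes), utf8),

    {RevPoints, RevBytes, Decoded}.
```

lists:reverse/1 itself knows nothing about encodings in either case; it only
reverses integers. Whether the result is meaningful depends on what the
integers stand for.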

CGS



On Tue, Jul 31, 2012 at 11:48 AM, Masklinn <masklinn@REDACTED> wrote:

> On 2012-07-31, at 11:36 , CGS wrote:
> >
> > I might be wrong, but wouldn't switching to UTF-8 by default force
> > the compiler to use 2 bytes (at least) per character?
>
> No? The first 128 code points (ASCII) fit in a single byte.
>
> > If so, for example, what
> > about the databases based on Erlang for projects using strict Latin-1?
>
> The ASCII (7-bit) characters would be stored in 1 byte; those beyond
> that (below code point 2048) would take 2 bytes.