[erlang-questions] String encoding and character set

Wed Jan 17 01:59:51 CET 2007

My guess is that with a string type you can access the nth character of
the message by its position, which can be very difficult to do with a list
if the encoding uses different sizes for different characters (and
sometimes the same character can be encoded differently depending on the
previous ones: contextual encoding). I guess the string type abstracts all
that away, but a list is enough for encodings like ASCII or UTF-8. So two
questions: (1) am I clear? (2) if yes, am I right? ;)
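
To make the question concrete, here is a minimal sketch of the difference
between indexing a list of code points and indexing encoded bytes. It
assumes an OTP release that ships the unicode and binary modules; the
module name nth_char and the function demo/0 are made up for illustration.

```erlang
-module(nth_char).
-export([demo/0]).

%% Hypothetical demo: compare per-character and per-byte indexing.
demo() ->
    %% "héllo" as a list of Unicode code points: one element per character.
    Chars = [104, 233, 108, 108, 111],
    %% The same text encoded as UTF-8: 233 (é) becomes two bytes, 195 and 169.
    Utf8 = unicode:characters_to_binary(Chars, unicode, utf8),
    %% Indexing the code point list yields the 2nd character directly.
    233 = lists:nth(2, Chars),
    %% Indexing the raw bytes (zero-based) yields only the lead byte of é.
    195 = binary:at(Utf8, 1),
    %% Decoding the bytes first restores per-character access.
    Chars = unicode:characters_to_list(Utf8, utf8),
    %% Five characters, six bytes.
    {length(Chars), byte_size(Utf8)}.
```

So positional access works on the list as long as it holds decoded code
points; it only breaks when the list (or binary) holds the encoded bytes.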

On 1/17/07, Robert Virding <robert.virding@REDACTED> wrote:
>
> We do actually, in fact we have something much much better, a list.
> Using a list you don't have to worry about encodings but can use the
> Unicode value directly in the string/list. This makes all processing
> much easier. Then when you are done you can convert it to whatever
> encoding you want.
>
> I don't really understand why anyone would want to process data in an
> unnecessarily complex format instead of a simple one.
>
> Robert
>
> dda wrote:
> > String types – at least well-implemented ones – don't just store a
> > string, but also encoding information. They are/should be geared
> > towards pain-free manipulation of text data, and by text I mean things
> > outside ASCII-land. Encodings-aware string manipulation functions
> > don't function on bytes, but on characters, a quite different notion.
> > We don't have this in Erlang.
> >