Strings (was: Re: are Mnesia tables immutable?)

Thu Jun 29 19:17:41 CEST 2006

Richard,

On Jun 29, 2006, at 2:41 PM, Richard A. O'Keefe wrote:

>
> True, there are all sorts of good things about UTF-8.  It's really cool
> that modern systems come with UTF-8 locales set by default so I can type
> practically _anything_ in TextEdit.  BUT it's a *Transmission* format,
> that's what the "T" and "F" stand for.  It was never designed to be
> used for serious *processing*
>
> UTF-8 is a great representation for C, but for a language where
> characters never were stored as bytes in the first place it is pretty
> pointless.
>
> Apparently people are already saying bad things about Erlang string
> handling; what do you think they'll say when they hear that a single
> character might require 8 words (32 bytes)?
>

I'm confused about one point in your posts.
It seems from the above that you appreciate that it's grossly
inefficient to use 4 or 8 bytes (the latter on 64-bit Erlang) per
character as a way of representing strings in Erlang.  However, in
other messages in this thread you still seem to propose using lists of
integers (one cell per character).
Are you talking about two different things?  That is, one
memory-efficient form for when the string doesn't need to be accessed
at the character level, and the list-of-integers form for when it
does?
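
To make the two forms concrete, here is a minimal sketch of how one might compare them. The module name string_cost and the function compare/1 are made up for illustration; erts_debug:flat_size/1 is an undocumented debugging helper, so treat the numbers as indicative only. The sketch also assumes every character fits in one byte (Latin-1), which sidesteps the encoding question entirely.

```erlang
-module(string_cost).
-export([compare/1]).

%% Hypothetical helper: returns {ListBytes, BinaryBytes} for a string
%% whose characters all fit in one byte (0..255).
compare(String) when is_list(String) ->
    %% Word size of the running emulator: 4 on 32-bit, 8 on 64-bit.
    WordSize = erlang:system_info(wordsize),
    %% Heap words occupied by the list. Each cons cell holds the
    %% character and a pointer to the tail of the list.
    ListWords = erts_debug:flat_size(String),
    %% Compact form: one byte of payload per character
    %% (the binary's own header is not counted here).
    Bin = list_to_binary(String),
    {ListWords * WordSize, byte_size(Bin)}.
```

Calling string_cost:compare("hello") should report two words per character for the list form, because each cell carries a tail pointer as well as the character, against five payload bytes for the binary. The binary is the compact form, while the list is the form that is easy to walk one character at a time.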

thanks, ke han