Strings (was: Re: are Mnesia tables immutable?)

Romain Lenglet rlenglet@REDACTED
Wed Jun 28 12:45:07 CEST 2006


ke han wrote:
> On Jun 28, 2006, at 2:47 PM, Romain Lenglet wrote:
> > Personally, I am voting for (1) representing strings as
> > lists of Unicode code points, but (2) providing a better
> > (more flexible, more efficient) external representation, and
> > most importantly (3) providing a more flexible interface to
> > the external encoding/decoding primitives, such as
> > supporting strings as tuples as above.
>
> I don't care about the internal representation of strings so
> long as it (a) is _significantly_ more memory efficient than one
> word per character in a list and (b) allows me to pass these
> non-mutable strings between processes without a mem copy each
> time.
>
> My end game is writing web apps in erlang+yaws+mnesia.

What we were discussing is how to represent strings internally, and
how to encode them externally (in the term_to_binary/1 sense), in a
form suitable for building or modification by programs. You are
discussing the need to pass around strings that are already 8-bit
encoded and that don't need to be modified.
Different problems. Different representations.
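
To make the distinction concrete, here is a minimal sketch. It assumes
an Erlang/OTP release that ships the `unicode` module (a later addition
than this message); the module name `str_repr` and the function
`demo/0` are made up for illustration.

```erlang
-module(str_repr).
-export([demo/0]).

%% Contrast the two representations discussed above.
demo() ->
    %% (1) A string as a list of Unicode code points: convenient to
    %% build and modify with ordinary list operations.
    CodePoints = [72, 233, 8364],          % "H", e-acute, euro sign
    Modified = CodePoints ++ [33],         % append "!"

    %% The external term format round-trips the list unchanged.
    Modified = binary_to_term(term_to_binary(Modified)),

    %% (2) The same text as an 8-bit encoded binary (UTF-8 here):
    %% compact, and meant to be passed around rather than edited.
    Utf8 = unicode:characters_to_binary(Modified),

    %% Decoding gives back the original code points.
    Modified = unicode:characters_to_list(Utf8),

    {Modified, Utf8}.
```

Calling `str_repr:demo()` returns the four code points together with a
7-byte binary: one byte each for "H" and "!", two for the accented
letter, three for the euro sign.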

> The basic result of any yaws page (or any dynamic html server)
> is to output a sequence of terms into a stream the browser is
> expecting. This means the following concatenation or list of
> "strings"  is common in streaming out a page:
>
> Header + StaticWebPagePreamble +
> StaticContentSuchAsLabelsLookedUpByUsersLangPref +
> HTMLInputControl + ContentForInputControl + ...  +
> HTMLSelectControl +
> ContentForSelectControl + StaticWebPageFooter
[...]

Since you don't seem to need to modify the contents of those
strings, why don't IO-lists (i.e. lists of binaries) fit your
need? You could simply pass a list of binaries, where each
binary contains text encoded as bytes in UTF-8, ISO-8859-1, or
whatever. Binaries are not copied. Such IO-lists are what is
used to communicate with linked-in C drivers. IO-lists are the
most efficient way to transmit large data within an Erlang node.
Why doesn't that fit your needs?
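
As a rough sketch of what I mean, a page can be assembled as an IO-list
of binaries without any concatenation. The module name `page_sketch`,
the functions `render/2` and `demo/0`, and the HTML fragments are all
hypothetical, and no HTML escaping is done here.

```erlang
-module(page_sketch).
-export([render/2, demo/0]).

%% Static fragments are binary literals, reused for every page.
-define(HEADER, <<"<html><body>">>).
-define(FOOTER, <<"</body></html>">>).

%% render(Label, Value) -> iolist()
%% Builds the page as a list of binaries; nothing is concatenated
%% and the dynamic parts are only referenced, not rewritten.
render(Label, Value) when is_binary(Label), is_binary(Value) ->
    [?HEADER,
     <<"<label>">>, Label, <<"</label>">>,
     <<"<input value=\"">>, Value, <<"\">">>,
     ?FOOTER].

demo() ->
    Page = render(<<"Name">>, <<"ke han">>),
    %% iolist_size/1 computes the total size without flattening.
    Size = iolist_size(Page),
    %% Flattened here only to inspect the result; a port or socket
    %% accepts the IO-list as it is.
    Flat = iolist_to_binary(Page),
    Size = byte_size(Flat),
    Flat.
```

The list returned by `render/2` can be handed directly to anything
that accepts iodata, for example `gen_tcp:send/2`. One caveat on "not
copied": within a node, only binaries larger than 64 bytes are
reference-counted and shared between processes; smaller ones live on
the process heap and are copied when sent in a message.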

-- 
Romain LENGLET


