[erlang-questions] Strings as Lists

Robert Virding rvirding@REDACTED
Tue Feb 12 23:13:02 CET 2008

I think it all boils down to what you are going to *do* with these strings.
If you are just going to store them somewhere for later then converting them
to a binary definitely saves space. If, however, you are going to *work* with
them then having them as lists is definitely much better. It is so much
easier than having a fixed sequence of octets. Also most, if not all,
declarative languages, both functional and logic, have very optimised list
handling because lists are so practical to work with.
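
To make the trade-off concrete, here is a minimal sketch. The module and
function names are made up for illustration and are not from any library.
It shows ordinary list operations applied directly to a string, and the
same string packed into a binary for storage.

```erlang
-module(str_list_demo).
-export([vowel_count/1, reversed/1, packed/1]).

%% "hello" is just the list [104,101,108,108,111], so a list
%% comprehension can filter its characters directly.
vowel_count(Str) ->
    length([C || C <- Str, lists:member(C, "aeiou")]).

%% Any generic list function works on a string unchanged.
reversed(Str) ->
    lists:reverse(Str).

%% For storage, pack the characters into a binary: one byte per
%% character, as long as every character fits in 0..255.
packed(Str) ->
    list_to_binary(Str).
```

A list costs two machine words per character (16 bytes on a 64-bit
emulator), while the binary costs one byte per character plus a small
header, which is where the space saving comes from.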

As mentioned in the next mail you can also keep them as iolists while
processing to make it efficient to send the strings out into the big wide
world. This is sort of the best of both worlds.
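
A rough sketch of the iolist idea follows. The module and function names
are hypothetical. Pieces are nested rather than appended with ++, and
nothing is copied until the data is flattened or written out.

```erlang
-module(iolist_demo).
-export([greeting/1, flatten/1]).

%% An iolist may mix strings, binaries and single byte values,
%% nested to any depth. Building one is just consing.
greeting(Name) ->
    ["Hello, ", Name, <<"!">>, $\n].

%% Flatten only when a contiguous binary is really needed. Ports and
%% sockets accept iolists as they are, so this step is often skipped.
flatten(IoList) ->
    iolist_to_binary(IoList).
```

Note that the integers in an iolist must be bytes (0..255), so this works
as shown only for Latin-1 text or for data that is already encoded.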

Also having them as lists means you get UTF-16 and 32 for free, since each
element is an integer that can hold any character code, and most of your
libraries still work straight out of the box. This, UTF-16/32, I think
will become much more important in the future when the number of internet
users who don't have a Latin charset as their base increases. Think of the
influence of a few hundred million Indians and Chinese who want 32-bit
charsets. :-)
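
As an illustration of the point about wide characters, here is a small
sketch with hypothetical module and function names. It assumes a
present-day OTP release that ships the `unicode` module, which was added
to OTP after this thread; the list functions need nothing beyond the
standard `lists` module.

```erlang
-module(codepoint_demo).
-export([sample/0, char_count/1, utf8_bytes/1]).

%% A string with characters above 255: Greek alpha (U+03B1),
%% the CJK character U+4E2D, and a plain "a".
sample() ->
    [16#03B1, 16#4E2D, $a].

%% One list element per character, whatever its code.
char_count(Str) ->
    length(Str).

%% Encoding happens only at the boundary. Assumes the unicode module
%% is available; the default output encoding is UTF-8.
utf8_bytes(Str) ->
    unicode:characters_to_binary(Str).
```

The three characters take three list elements, but six bytes once encoded
as UTF-8, so character-wise code such as lists:reverse/1 stays correct on
the list form while it would not on the raw bytes.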


On 12/02/2008, Masklinn <masklinn@REDACTED> wrote:
> On 12 Feb 2008, at 17:19 , tsuraan wrote:
> > Why does erlang internally represent strings as lists?  In every language
> > I've used other than Java, a string is a sequence of octets, just like
> > Erlang's binary type.  I know that you can represent a string efficiently by
> > using <<"string">> rather than just "string", but why doesn't erlang do this
> > by default?  Is it just because pre-12B binary handling wasn't as efficient
> > as list handling, or is Erlang intended to support UTF-32?
> >
> A lot of functional languages represent strings as lists rather than
> arrays (Haskell also does that, and only recently got bytestrings)
> because lists are their basic collection datatype (due to being a
> recursive structure and everything), and this allows the use of all
> the list-related functions on strings.
> Representing strings as arrays of bytes or characters (which is pretty
> much also what Java does, by the way) is an attribute of imperative
> languages whose basic collection datatype is the array.
> My guess is that's the reason why: a lot of string operations were
> already implemented on lists (reduces code duplication) and string
> efficiency wasn't really of importance in the erlang world until
> fairly recently, so strings being represented as lists of integers
> wasn't much of a problem.