[erlang-questions] Strings as Lists

Tue Feb 12 17:45:07 CET 2008

On 12 Feb 2008, at 17:19 , tsuraan wrote:

> Why does erlang internally represent strings as lists?  In every
> language I've used other than Java, a string is a sequence of octets,
> just like Erlang's binary type.  I know that you can represent a
> string efficiently by using <<"string">> rather than just "string",
> but why doesn't erlang do this by default?  Is it just because pre-12B
> binary handling wasn't as efficient as list handling, or is Erlang
> intended to support UTF-32?
>

A lot of functional languages represent strings as lists rather than
arrays (Haskell does too, and only recently got bytestrings) because
the list is their basic collection datatype: it is a recursive
structure, so it fits pattern matching and recursive functions
naturally. Representing strings as lists means every list-related
function works on strings as well.
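
To make that concrete, here is a small sketch. The module name
strings_as_lists and the function demo/0 are made up for this example;
everything else is standard Erlang (the lists module and built-in
functions). Each line is a match that would fail with badmatch if the
claim were false.

```erlang
-module(strings_as_lists).
-export([demo/0]).

%% Hypothetical demo function: returns ok if every claim below holds.
demo() ->
    %% A string literal is just a list of character codes.
    true = ("abc" =:= [97, 98, 99]),
    true = is_list("abc"),

    %% Generic list functions work on strings unchanged.
    "olleh" = lists:reverse("hello"),
    5 = length("hello"),
    "HELLO" = lists:map(fun(C) when C >= $a, C =< $z -> C - 32;
                           (C) -> C
                        end, "hello"),

    %% So does ordinary list pattern matching.
    [First | Rest] = "hello",
    $h = First,
    "ello" = Rest,

    %% The binary form is a different type; conversion is explicit.
    <<"hello">> = list_to_binary("hello"),
    true = is_binary(<<"hello">>),
    ok.
```

Compile it and call strings_as_lists:demo() from the shell; it should
return ok.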

Representing strings as arrays of bytes or characters (which is pretty
much also what Java does, by the way) is an attribute of imperative
languages whose basic collection datatype is the array.

My guess is that this is the reason: a lot of string operations were
already implemented on lists (which avoids duplicating code), and
string efficiency wasn't really important in the Erlang world until
fairly recently, so representing strings as lists of integers wasn't
much of a problem.