[erlang-questions] Strings as Lists
Lev Walkin
vlm@REDACTED
Wed Feb 13 02:32:44 CET 2008
Robert Virding wrote:
> I think it all boils down to what you are going to *do* with these
> strings. If you are just going to store them somewhere for later then
> converting them to a binary definitely saves space. If, however, you are
> going to *work* with them then having them as lists is definitely much
> better. It is so much easier than having a fixed sequence of octets. Also
> most, if not all, declarative languages, functional and logic, have very
> optimised list handling because lists are so practical to work with.
>
> As mentioned in the next mail you can also keep them as iolists while
> processing to make it efficient to send the strings into the big wide
> world. This is sort of the best of both worlds.
>
> Also having them as lists means you get UTF-16 and 32 for free, and most
> of your libraries still work straight out of the box. This, UTF-16/32, I
> think will become much more important in the future when the number of
> internet users who don't have a Latin charset as their base increases.
> Think of the influence of a few hundred million Indians and Chinese who
> want 32-bit charsets. :-)
Small correction: UTF-16 and UTF-32 are practically dead; you certainly
need to think in terms of UTF-8 nowadays.
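
To make the list-versus-encoding distinction concrete, here is a minimal
sketch. It assumes an OTP release that ships the unicode module, which is
not mentioned anywhere in this thread, and the module name utf8_demo is
hypothetical. A list of integers holds one code point per element whatever
the encoding, while the UTF-8 form is a binary whose size in bytes can
differ from the number of characters.

```erlang
-module(utf8_demo).
-export([demo/0]).

%% Returns {CodePoints, Utf8Bytes, RoundTripOk} for a six-letter Cyrillic word.
%% Assumption: the unicode module is available in the running OTP release.
demo() ->
    %% Six Unicode code points, one list element per character.
    Chars = [1055, 1088, 1080, 1074, 1077, 1090],
    %% Encode the code points as a UTF-8 binary (the default output encoding).
    Bin = unicode:characters_to_binary(Chars),
    %% Decoding the binary gives back the original list of code points.
    RoundTrip = unicode:characters_to_list(Bin) =:= Chars,
    {length(Chars), byte_size(Bin), RoundTrip}.
```

Calling utf8_demo:demo() should return {6, 12, true}: six characters in the
list, twelve bytes once encoded as UTF-8.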
> Robert
>
> On 12/02/2008, Masklinn <masklinn@REDACTED> wrote:
>
>
> On 12 Feb 2008, at 17:19 , tsuraan wrote:
>
> > Why does erlang internally represent strings as lists? In every language
> > I've used other than Java, a string is a sequence of octets, just like
> > Erlang's binary type. I know that you can represent a string efficiently by
> > using <<"string">> rather than just "string", but why doesn't erlang do this
> > by default? Is it just because pre-12B binary handling wasn't as efficient
> > as list handling, or is Erlang intended to support UTF-32?
> >
>
> A lot of functional languages represent strings as lists rather than
> arrays (Haskell also does that, and only recently got bytestrings)
> because lists are their basic collection datatype (due to being a
> recursive structure and everything), and this allows the use of all
> the list-related functions on strings.
>
> Representing strings as arrays of bytes or characters (which is pretty
> much also what Java does, by the way) is an attribute of imperative
> languages whose basic collection datatype is the array.
>
> My guess is that's the reason why: a lot of string operations were
> already implemented on lists (reduces code duplication) and string
> efficiency wasn't really of importance in the Erlang world until
> fairly recently, so strings being represented as lists of integers
> wasn't much of a problem.
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED <mailto:erlang-questions@REDACTED>
> http://www.erlang.org/mailman/listinfo/erlang-questions