<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Richard Carlsson wrote:
<blockquote cite="mid:47B1D0AE.5030803@it.uu.se" type="cite">
<pre wrap="">tsuraan wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Why does erlang internally represent strings as lists? In every
language I've used other than Java, a string is a sequence of octets,
just like Erlang's binary type. I know that you can represent a string
efficiently by using &lt;&lt;"string"&gt;&gt; rather than just "string", but why
doesn't erlang do this by default? Is it just because pre-12B binary
handling wasn't as efficient as list handling, or is Erlang intended to
support UTF-32?
</pre>
</blockquote>
<pre wrap=""><!---->
Strings as lists is simple and flexible (i.e., if you already have lists,
you don't need to add another data type). Functions that work on lists,
such as append, reverse, etc., can be used directly on strings; you
don't need to program in different styles if you're traversing a list
or a string; etc. </pre>
</blockquote>
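This only holds while every list element is a whole character, which in practice means ASCII text ;) Once the string arrives as UTF-8, the list holds bytes rather than characters, and the list functions scramble it:<br>
<br>
lists:reverse("text") %% gives you "txet"<br>
lists:reverse("текст") %% Russian for "text"; with UTF-8 input this returns
[130,209,129,209,186,208,181,208,130,209], the bytes in reverse order, which is no longer
valid UTF-8 and clearly not what I wanted :)<br>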
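<br>
A minimal sketch of the difference, assuming an Erlang/OTP release recent enough to ship the unicode module. The module name reverse_demo and its functions are hypothetical, made up for illustration. The UTF-8 bytes of "текст" are written out numerically so the result does not depend on the encoding of the source file:<br>
<pre>
-module(reverse_demo).
-export([reverse_bytes/0, reverse_chars/0]).

%% UTF-8 encoding of "текст", two bytes per Cyrillic letter.
utf8_text() -&gt;
    &lt;&lt;209,130,208,181,208,186,209,129,209,130&gt;&gt;.

%% Treat the text as a plain list of bytes and reverse it.
%% The two bytes of each character get swapped along the way.
reverse_bytes() -&gt;
    lists:reverse(binary_to_list(utf8_text())).

%% Decode to a list of code points first, reverse that list,
%% then encode the result back to a UTF-8 binary.
reverse_chars() -&gt;
    CodePoints = unicode:characters_to_list(utf8_text(), utf8),
    unicode:characters_to_binary(lists:reverse(CodePoints)).
</pre>
Here reverse_bytes() should reproduce the scrambled list shown above, while reverse_chars() should return the UTF-8 bytes of "тскет". So a list works fine as a string once its elements are code points rather than encoded bytes.<br>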
</body>
</html>