[erlang-questions] Strings as Lists

Christian S chsu79@REDACTED
Tue Feb 12 19:21:45 CET 2008


On Feb 12, 2008 5:45 PM, Masklinn <masklinn@REDACTED> wrote:
> My guess is that's the reason why: a lot of string operations were
> already implemented on lists (reduces code duplication) and string
> efficiency wasn't really of importance in the erlang world until
> fairly recently, so strings being represented as lists of integers
> wasn't much of a problem.

That is the reason. Hysterical Raisins.

There was a time when Erlang didn't have binaries. Someone thought it
would be a good idea to make "ABC" a way to write [65,66,67]. If you
look at the old "eddie" web load balancer project you see the DNS
protocol being decoded using lists. The "" syntax for lists is a
pragmatic solution to make code more readable when you need ASCII
sequences in your lists.

I would not ask "Why does Erlang internally represent strings as lists?".

Erlang does not have strings. It has a shorthand syntax for creating
lists. If you still consider "ABC" to be a string, then the list is
certainly not an "internal representation". Go ahead and treat it as a
list.
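
To make the point concrete, here is a minimal sketch showing that a
double-quoted literal and a list of integers are the same term. The
module name strings_as_lists and the function demo/0 are made up for
illustration; only standard BIFs and the lists module are used.

```erlang
%% Hypothetical module, for illustration only.
-module(strings_as_lists).
-export([demo/0]).

demo() ->
    %% "ABC" is shorthand for a list of character codes.
    true = ("ABC" =:= [65, 66, 67]),
    true = is_list("ABC"),
    %% $A is the character literal for the integer 65.
    true = ([$A, $B, $C] =:= "ABC"),
    %% Ordinary list functions and list patterns work on it.
    3 = length("ABC"),
    "CBA" = lists:reverse("ABC"),
    [65 | Tail] = "ABC",
    "BC" = Tail,
    ok.
```

Every match above either succeeds or crashes with badmatch, so
demo() returning ok means each claim held.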

I would ask "Why do some programmers store their large text-masses as lists?"

Of course, I know the answer already: because there is a 'string'
module that operates on lists as strings. Lazy buggers.

Alternative ways to handle larger text-masses:
- binaries (a representation that is 1:1 with the character
  encoding itself, and as of R12B also efficient scanning and
  tail-construction)
- iolists (cheap concatenation of large texts)
- list of words and a word-dictionary (quicker scanning of
  words, efficient storage too)
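
A rough sketch of the first two alternatives, assuming plain ASCII
text. The module name text_alternatives and the function demo/0 are
made up for illustration; byte_size/1, iolist_size/1 and
iolist_to_binary/1 are standard BIFs.

```erlang
%% Hypothetical module, for illustration only.
-module(text_alternatives).
-export([demo/0]).

demo() ->
    %% Binary: one byte per ASCII character.
    Bin = <<"hello world">>,
    11 = byte_size(Bin),
    %% Scanning: match a known prefix and keep the rest as a binary.
    <<"hello", Rest/binary>> = Bin,
    <<" world">> = Rest,
    %% Tail-construction: build a new binary by appending at the end.
    Bin2 = <<Bin/binary, "!">>,
    <<"hello world!">> = Bin2,
    %% iolist: concatenate by nesting binaries and lists; the pieces
    %% are only copied together when the iolist is flattened.
    IoList = [Bin, "; ", [<<"and">>, " more"]],
    21 = iolist_size(IoList),
    <<"hello world; and more">> = iolist_to_binary(IoList),
    ok.
```

Ports, sockets and file writes accept iolists directly, so in many
programs the final iolist_to_binary/1 call is not needed at all.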

It all comes down to what you really are doing with your large texts.

PS.

For the scanning of protocols, I have been looking at Ragel as a tool
to create C-code FSMs as a loadable driver that recognizes tokens and
sends these tokens to the port owner process. The port owner in turn
feeds the port binary chunks, since incremental parsing isn't much of a
problem for state machines.

Of course, I have only got as far as teaching myself Ragel and
realizing that it is still easy to make mistakes. It would be nice
to have a Ragel that produces Erlang code.

Has anyone else experimented with Ragel? I know that angry-ruby-guy
used it for Mongrel, that is how I found out about it.


