[erlang-questions] Strings as Lists

Zvi exta7@REDACTED
Wed Feb 13 05:00:16 CET 2008

Hi Christian,

Christian S wrote:
> I would ask "Why do some programmers store their large text-masses as
> lists?"
> Of course, I know the answer already; because there is a 'string'
> module that operates on lists as strings. Lazy buggers.

Still, there is a need for a standard string datatype that is good for
90% of uses and is accepted by all the standard libs.
I represent strings as binaries, and my code becomes much more verbose
(almost unreadable), e.g.:
* <<"ABC">> instead of "ABC"
* <<S1/bytes,S2/bytes>> instead of S1++S2
* file:delete(binary_to_list(Filename)) instead of file:delete(Filename)
* xmerl and erlsom parse into lists, not binaries (I heard about an expat
port, which can parse binary XML, but I don't know how to extract its code
out of ejabberd).
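
To make the comparison above concrete, here is a minimal sketch of the list and
binary forms side by side. The module and function names (binstr_sketch,
concat_list/2, concat_bin/2, delete_file/1) are hypothetical and only serve the
illustration; the expressions inside them are the ones from the list above.

```erlang
-module(binstr_sketch).
-export([concat_list/2, concat_bin/2, delete_file/1]).

%% List strings: the literal is "ABC" and concatenation is ++.
concat_list(S1, S2) when is_list(S1), is_list(S2) ->
    S1 ++ S2.

%% Binary strings: the literal is <<"ABC">> and concatenation is spelled
%% out with the bit syntax ("bytes" is an alias for "binary").
concat_bin(S1, S2) when is_binary(S1), is_binary(S2) ->
    <<S1/bytes, S2/bytes>>.

%% A binary filename is converted to a list before it is passed to a
%% list-based library call (the conversion the post complains about).
delete_file(Filename) when is_binary(Filename) ->
    file:delete(binary_to_list(Filename)).
```

For example, binstr_sketch:concat_bin(<<"ABC">>, <<"DEF">>) gives <<"ABCDEF">>,
just as binstr_sketch:concat_list("ABC", "DEF") gives "ABCDEF".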

Christian S wrote:
> - list of words and a word-dictionary (features quicker scanning of
> ...words, efficient storage too)

I want to implement something like this, but using atoms for words. Is this
a good idea?
There is a limit on the number of atoms in the VM (I think ~1M). I can preload
the list of atoms, one per word, and then use only list_to_existing_atom ...
I'll have around 100000 words/atoms. Do you think it would be much better to
use ets with integer word IDs mapped to binaries?
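
A rough sketch of the two options being weighed, under stated assumptions: the
module name word_dict_sketch, the table names and the functions new/0,
intern/2, lookup_word/2 and known_atom/1 are all hypothetical. The ets variant
keeps two tables (word to ID, and ID to word) and assumes a single process owns
and writes them; the atom variant only resolves words whose atoms already
exist, so unknown input cannot grow the atom table.

```erlang
-module(word_dict_sketch).
-export([new/0, intern/2, lookup_word/2, known_atom/1]).

%% Create the two ets tables: binary word -> integer ID, and the reverse.
new() ->
    ByWord = ets:new(by_word, [set]),
    ById = ets:new(by_id, [set]),
    {ByWord, ById}.

%% Return the ID of Word, assigning the next free ID the first time it is
%% seen. Not safe for concurrent writers; assumes one owning process.
intern({ByWord, ById}, Word) when is_binary(Word) ->
    case ets:lookup(ByWord, Word) of
        [{Word, Id}] ->
            Id;
        [] ->
            Id = ets:info(ByWord, size) + 1,
            ets:insert(ByWord, {Word, Id}),
            ets:insert(ById, {Id, Word}),
            Id
    end.

%% Map an ID back to its binary word.
lookup_word({_ByWord, ById}, Id) when is_integer(Id) ->
    case ets:lookup(ById, Id) of
        [{Id, Word}] -> {ok, Word};
        [] -> error
    end.

%% Atom variant: succeeds only for atoms that were created beforehand,
%% otherwise returns 'unknown' instead of creating a new atom.
known_atom(Word) when is_list(Word) ->
    try
        list_to_existing_atom(Word)
    catch
        error:badarg -> unknown
    end.
```

With D = word_dict_sketch:new(), repeated calls to intern(D, <<"hello">>)
return the same ID, and lookup_word(D, Id) recovers the binary.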

Christian S wrote:
> For the scanning of protocols, I have been looking at Ragel as a tool
> to create C-code FSMs as a loadable driver that recognizes tokens and
> sends these tokens to the port owner process. The port owner in turn
> feeds the port binary chunks, since incremental parsing isnt much of a
> problem for state machines.

How is Ragel better than other lexical analysers? Do you use it primarily
because it parses binary input, while the Erlang lexer (leex) works with lists?

