[erlang-questions] Strings usage caveats

Robert Raschke <>
Mon Mar 5 18:44:03 CET 2012


With Unicode, a string becomes a list of code points! ASCII is a thing of
the past.
See http://www.erlang.org/doc/man/unicode.html
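
To make that concrete, here is a minimal sketch (the module name codepoints_demo is made up for illustration). It uses only unicode:characters_to_binary/1 and unicode:characters_to_list/1 from the page above. The list 1055,1088,1080,1074,1077,1090 is the Russian word "Привет"; the list length counts code points, while the size of the UTF-8 binary counts bytes:

```erlang
-module(codepoints_demo).
-export([demo/0]).

demo() ->
    %% A plain ASCII string literal is just a list of integers.
    true = ("hello" =:= [104,101,108,108,111]),
    %% Code points above 255 are perfectly legal list elements.
    Cyr = [1055,1088,1080,1074,1077,1090],
    %% Encode the code points as UTF-8 (the default output encoding).
    Utf8 = unicode:characters_to_binary(Cyr),
    %% Decoding the UTF-8 binary gives back the same code points.
    Cyr = unicode:characters_to_list(Utf8),
    {length(Cyr), byte_size(Utf8)}.
```

Calling codepoints_demo:demo() should give {6,12}: six code points, twelve bytes of UTF-8.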

But always remember, strings in general are not as easy as you think.

Robby

2012/3/5 Matthew Evans <>

> In a nutshell, a string in Erlang is represented as a list of (ASCII)
> characters. So "hello" becomes: [104,101,108,108,111].
>
> This has many advantages when it comes to processing strings. But
> there are problems:
>
> 1) It can take up lots of memory. A list is 1 word + 1 word for each
> element + size of element, and each character is itself stored as a full
> word, so a string costs 2 words per character. So "hello" would (on a
> 64 bit machine) be 88 bytes.
>
> 2) Many of the modules are implemented in Erlang (rather than as BIFs).
> Doing extensive string manipulation this way *could* be slow (when compared
> to C or other languages).
>
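The memory point is easy to check rather than guess. A minimal sketch, assuming erts_debug:flat_size/1 (an undocumented debugging helper, so treat its numbers as indicative) and a hypothetical module name str_size:

```erlang
-module(str_size).
-export([words/1, bytes/1]).

%% Heap size of a term in machine words. This does not include the
%% one word that refers to the term from wherever it is stored.
words(Term) ->
    erts_debug:flat_size(Term).

%% The same size in bytes for the running VM (8 bytes per word on 64 bit).
bytes(Term) ->
    words(Term) * erlang:system_info(wordsize).
```

On a 64-bit VM I would expect str_size:words("hello") to come back as 10 (two words per list cell), so 80 bytes on the heap before counting the word that points at the list. The binary <<"hello">> should come out noticeably smaller.
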
> Fortunately, if you need performance you can represent strings as binaries
> (I personally think that we should be thinking of strings as binaries all
> the time now).
>
> So the string "hello" would become <<"hello">> as a binary. The memory
> efficiency is much better than with lists (for anything over a few tens of
> bytes it's pretty much "native" size; there is a small overhead, IIRC).
> Better still, you can use the very fast binary module to do much of the
> processing. That, together with binary comprehensions and binary pattern
> matching, allows you to build powerful applications.
>
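For anyone who has not used those three tools yet, a rough sketch of each. The module bin_demo, its function names and the "GET " prefix are invented for the example; binary:split/2 is the only library call:

```erlang
-module(bin_demo).
-export([upcase_ascii/1, first_word/1, strip_prefix/1]).

%% Binary comprehension: walk the binary one byte at a time and
%% build a new binary with ASCII lower-case letters upper-cased.
upcase_ascii(Bin) when is_binary(Bin) ->
    << <<(to_upper(C))>> || <<C>> <= Bin >>.

to_upper(C) when C >= $a, C =< $z -> C - 32;
to_upper(C) -> C.

%% binary module: split on the first space and keep what precedes it.
first_word(Bin) when is_binary(Bin) ->
    case binary:split(Bin, <<" ">>) of
        [Word, _Rest] -> Word;
        [Word] -> Word
    end.

%% Binary pattern matching: strip a known prefix in a function head.
strip_prefix(<<"GET ", Rest/binary>>) -> {ok, Rest};
strip_prefix(_Other) -> nomatch.
```

Note that upcase_ascii/1 works byte by byte, so it is only safe for ASCII text; UTF-8 data would need the /utf8 segment type or the unicode module.
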
> Personally I've refactored much of my "string handling" code to use
> binaries now. What would be nice is for the "re" and "string" modules to
> allow binaries and lists as input.
>
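On the re side, a small sketch: as far as I know re:run/3 already takes iodata for the subject, so binaries and lists both go straight in, and the capture option chooses what comes back. The module and function names (re_demo, find/2) are invented for the example:

```erlang
-module(re_demo).
-export([find/2]).

%% Return the first match of Pattern in Subject, in the same
%% representation (binary or list) that the subject came in.
find(Subject, Pattern) when is_binary(Subject) ->
    capture(re:run(Subject, Pattern, [{capture, first, binary}]));
find(Subject, Pattern) when is_list(Subject) ->
    capture(re:run(Subject, Pattern, [{capture, first, list}])).

%% Unwrap the single captured value, or pass nomatch through.
capture({match, [Hit]}) -> {ok, Hit};
capture(nomatch) -> nomatch.
```

So re_demo:find(<<"hello world">>, "wo.") gives {ok,<<"wor">>} and the same call with a list subject gives {ok,"wor"}.
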
> Matt
>
> > Date: Mon, 5 Mar 2012 17:49:48 +0400
> > From: 
> > To: 
> > Subject: [erlang-questions] Strings usage caveats
>
> >
> > Hello all,
> >
> > I am studying how Erlang strings are used in production. In
> > doc/efficiency_guide/myths.html there is a paragraph that says
> > "Actually, string handling could be slow if done improperly. In
> > Erlang, you'll have to think a little more about how the strings are
> > used and choose an appropriate representation and use the re module
> > instead of the obsolete regexp module if you are going to use regular
> > expressions."
> >
> > I have very little experience programming in Erlang/OTP, so that
> > sentence was rather abstract for me. I suppose that the root of the
> > problems with strings is the immutability of variables, and thus the
> > copying of the whole source string whenever it is modified. But it seems
> > to me that this is not the whole story.
> >
> > Could you please point me to sources to read, or examples and
> > hints, about string performance in Erlang?
> >
> > --
> > Александр Винокуров
> > +7 (921) 982-21-43
> > @aleksandrvin
> > _______________________________________________
> > erlang-questions mailing list
> > 
> > http://erlang.org/mailman/listinfo/erlang-questions
>