[erlang-questions] Strings usage caveats

Matthew Evans mattevans123@REDACTED
Mon Mar 5 18:01:39 CET 2012


In a nutshell, a string in Erlang is represented as a list of integer character codes (ASCII for plain English text). So "hello" becomes: [104,101,108,108,111].
This has many advantages in terms of being able to process strings. But there are problems:
1) It can take up a lot of memory. A list costs 1 word + 1 word for each element + the size of each element, and each character code is itself a full word. So "hello" would (on a 64-bit machine) be 11 words, or 88 bytes.
2) Many of the string-handling modules are implemented in Erlang (rather than as BIFs). Doing extensive string manipulation this way *could* be slow (when compared to C or other languages).
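To make the memory point concrete, here is a small sketch. The module name string_size is made up for illustration, and erts_debug:flat_size/1 is an undocumented debugging helper, so treat the figures as indicative rather than authoritative.

```erlang
-module(string_size).
-export([demo/0]).

%% Compare the heap footprint of "hello" as a list with its size as a binary.
demo() ->
    %% A string literal is just a list of integer character codes.
    true = ("hello" =:= [104,101,108,108,111]),
    %% flat_size/1 reports heap words: one 2-word cons cell per character.
    ListWords = erts_debug:flat_size("hello"),
    %% The payload of the equivalent binary is one byte per character.
    BinBytes = byte_size(<<"hello">>),
    {ListWords, BinBytes}.
```

Calling string_size:demo() should return {10, 5}: ten heap words for the cons cells (plus one more word for the reference to the list itself) against five bytes of payload in the binary, which carries its own small fixed overhead.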
Fortunately, if you need performance you can represent strings as binaries (I personally think that we should be thinking of strings as binaries all the time now).
So the string "hello" would become <<"hello">> as a binary. The memory efficiency is much better than with lists (for anything over a few tens of bytes it's pretty much "native" size; there is a small overhead, IIRC). Better still, you can use the very fast binary module to do much of the processing. That, together with binary comprehensions and binary pattern matching, allows you to build powerful applications.
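A rough sketch of those three tools on binary strings follows. The module bin_strings and its function names are hypothetical, and upcase/1 only handles ASCII letters.

```erlang
-module(bin_strings).
-export([split_words/1, upcase/1, strip_prefix/2]).

%% binary module: split on every single space.
split_words(Bin) when is_binary(Bin) ->
    binary:split(Bin, <<" ">>, [global]).

%% Binary comprehension: walk the bytes and build a new binary.
%% ASCII only; bytes outside a..z are copied unchanged.
upcase(Bin) when is_binary(Bin) ->
    << <<(to_upper(C))>> || <<C>> <= Bin >>.

to_upper(C) when C >= $a, C =< $z -> C - 32;
to_upper(C) -> C.

%% Binary pattern matching: check for a known prefix and return the rest.
strip_prefix(Prefix, Bin) when is_binary(Prefix), is_binary(Bin) ->
    Size = byte_size(Prefix),
    case Bin of
        <<Prefix:Size/binary, Rest/binary>> -> {ok, Rest};
        _ -> nomatch
    end.
```

For example, bin_strings:split_words(<<"hello big world">>) should give [<<"hello">>,<<"big">>,<<"world">>], and bin_strings:strip_prefix(<<"GET ">>, <<"GET /index">>) should give {ok,<<"/index">>}. None of these calls converts the data to a list along the way.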
Personally I've refactored much of my "string handling" code to use binaries now. What would be nice is for the "re" and "string" modules to allow binaries and lists as input.
Matt   
> Date: Mon, 5 Mar 2012 17:49:48 +0400
> From: aleksandr.vin@REDACTED
> To: erlang-questions@REDACTED
> Subject: [erlang-questions] Strings usage caveats
> 
> Hello all,
> 
> I am studying how Erlang strings are used in production. In
> doc/efficiency_guide/myths.html there is a paragraph that says
> "Actually, string handling could be slow if done improperly. In
> Erlang, you'll have to think a little more about how the strings are
> used and choose an appropriate representation and use the re module
> instead of the obsolete regexp module if you are going to use regular
> expressions."
> 
> I have very little experience programming in Erlang/OTP, so that
> sentence was rather abstract for me. I suppose that the root of the
> problems with strings is the immutability of variables, and thus the
> copying of the whole source string whenever it is modified. But it seems
> to me that this is not the whole story.
> 
> Can you please point me to sources to read, or to examples and
> hints about string performance in Erlang?
> 
> --
> Александр Винокуров
> +7 (921) 982-21-43
> @aleksandrvin
 		 	   		  