String representation in erlang

Thinus Pollard thinus@REDACTED
Tue Sep 13 17:35:52 CEST 2005


Thanks for all the help. You live, you learn.... well mostly you live ;)

It was an interesting exercise anyhow. I'm still getting my head around this
after a few years of procedural programming.

Maybe someone can use it in an RSA implementation somewhere, like the one
I've seen Joe mention.

Thinus

On Tuesday 13 September 2005 16:17, Shawn Pearce wrote:
> Interesting - but how is this better than a binary?
>
> If I recall the source code correctly any binary using less than
> 255 words is stored on the process heap; larger binaries are
> allocated in a shared heap (to reduce message passing costs).
> This of course means that any "string" stored in a binary would
> require 8 + NumberOf8BitChars bytes of memory (rounded up to the
> next full word).  If NumberOf8BitChars is < (255 * 4 = 1020) then
> it will be allocated on the private heap.
>
> Further, binaries can be easily pattern matched in function headers
> and are already handled by the io library; this packed string
> representation is more difficult to pattern match against and isn't
> directly handled by the io library functions.
>
> Thinus Pollard <thinus@REDACTED> wrote:
> > Hi there
> >
> > According to the Erlang efficiency guide a string is internally
> > represented as a list of integers, thus consuming 2 words (8 bytes on a
> > 32-bit platform) of memory *per* character.
> >
> > The attached code is an attempt at reducing the memory footprint of
> > strings in Erlang (when passing them between functions, etc.).
> >
> > The basic idea is to pack a string into n-byte integers and unpack
> > them on the other side. The text file called compare.txt also shows
> > the memory needed to represent strings as normal Erlang strings and
> > with this string packing.
> >
> > Normal Erlang strings are 2 words/character. The packed representation
> > uses 1 word of memory per list element plus n bytes/wordsize per integer
> > element, where every integer element contains n characters.
> >
> > Deficiencies:
> > If the string length is not divisible by n, space is wasted (the string
> > gets padded with zeros).
> >
> > Usage:
> > Pick your integer representation length.
> > packstring/1 takes a string and returns a list of n-byte integers.
> > unpackstring/1 takes an integer representation and returns a string.
> >
> > There is a simple test suite in test/0.
> >
> > If anyone can improve upon this code, please do. If this was an exercise
> > in futility, please let me know. I've only been programming Erlang for 2
> > weeks and still need to learn all the gotchas ;)
> >
> > --
> >
> > Thinus Pollard
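
For reference, the packing scheme described in the quoted message can be
sketched roughly as follows. This is not the attached code, which is not
reproduced here. The module name packstr and the functions pack/2 and
unpack/2 are made up for illustration, and the chunk size N is passed
explicitly instead of being picked once as in packstring/1. It assumes
8-bit (Latin-1) characters.

```erlang
-module(packstr).
-export([pack/2, unpack/2, test/0]).

%% pack(String, N): split String into chunks of N characters and store
%% each chunk as one big-endian integer. A short final chunk is padded
%% with zero bytes, as described in the quoted message.
%% (Hypothetical name; the original post calls its function packstring/1.)
pack([], _N) ->
    [];
pack(String, N) when is_integer(N), N > 0 ->
    {Chunk, Rest} = take(N, String, []),
    [chunk_to_int(Chunk) | pack(Rest, N)].

%% take(K, List, Acc): remove K characters from List, padding with 0
%% when List runs out. Returns {Chunk, Remainder}.
take(0, Rest, Acc) -> {lists:reverse(Acc), Rest};
take(K, [C | Rest], Acc) -> take(K - 1, Rest, [C | Acc]);
take(K, [], Acc) -> take(K - 1, [], [0 | Acc]).

%% Fold the characters of one chunk into a single integer, first
%% character in the most significant byte.
chunk_to_int(Chunk) ->
    lists:foldl(fun(C, Acc) -> (Acc bsl 8) bor C end, 0, Chunk).

%% unpack(Ints, N): inverse of pack/2. Every 0 byte is dropped, not only
%% the trailing padding, so a string containing 0 will not round-trip.
unpack(Ints, N) ->
    [C || I <- Ints, C <- int_to_chunk(I, N, []), C =/= 0].

%% Peel N bytes off the integer, least significant first, so the
%% accumulated list ends up in the original character order.
int_to_chunk(_I, 0, Acc) -> Acc;
int_to_chunk(I, K, Acc) -> int_to_chunk(I bsr 8, K - 1, [I band 255 | Acc]).

%% Small round-trip check, in the spirit of test/0 in the original post.
test() ->
    S = "hello, world",
    P = pack(S, 3),
    4 = length(P),                          % 12 characters / 3 per integer
    S = unpack(P, 3),
    "abcde" = unpack(pack("abcde", 4), 4),  % last chunk is padded
    ok.
```

One thing to watch when picking N: with N = 3 every chunk stays below
2^24, which should still fit an immediate small integer on a 32-bit
emulator (28 bits there, as far as I know). With N = 4 the integers can
spill into bignums, which live on the heap and cost extra words. To
compare the packed form against the plain string, erts_debug:flat_size/1
reports a term's size in words.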

-- 
Thinus Pollard

Mobile: +27 72 075 2751
