[erlang-questions] learner's questions -- tuples, CEAN & jungerl

Mon Apr 25 15:27:06 CEST 2011

Appreciate the explanation.

On Mon, Apr 25, 2011 at 5:18 PM, Jesper Louis Andersen <
jesper.louis.andersen@REDACTED> wrote:

> On Mon, Apr 25, 2011 at 11:16, Icarus Alive <icarus.alive@REDACTED>
> wrote:
>
> > 1. Found the following usage of tuple --
> >        { person, { name, joe }, { age, 42 } }
> >     Now when one has millions of such tuples, which are held in-memory,
> > isn't the memory footprint of the application at least 3 x 8 bytes
> > extraneous thanks to the 3 atoms (person, name, age) playing "field
> > names"? Is this a concern for anyone? Is there a more efficient storage?
>
> It is actually far worse than 3 * 8 bytes as the representation is
> boxed everywhere. So each of the tuples also takes up memory.
> There are some ways to limit the memory usage though:
>
> a) Use records. If we have a record #person { name = joe, age = 42},
> the internal representation is {person, joe, 42} which avoids a lot of
> the tupling overhead.
>

Okay, quickly jumping ahead and reading about records, it looks like this will
give a saving of at least 2 atoms per tuple, with still an 8-byte cost to pay
to be able to achieve some level of introspection.
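
A rough way to compare the two layouts from (a) is sketched below. The module
name person_size is made up for illustration, and erts_debug:flat_size/1 is an
undocumented debugging helper that reports heap size in machine words, so treat
its numbers as indicative only.

```erlang
-module(person_size).
-export([sizes/0]).

%% The record from (a); at runtime it is the tagged tuple {person, Name, Age}.
-record(person, {name, age}).

%% Returns {NestedWords, RecordWords}: the heap size, in machine words,
%% of the nested-tuple form and of the record form.
sizes() ->
    Nested = {person, {name, joe}, {age, 42}},
    Flat = #person{name = joe, age = 42},
    %% A record is just a tuple whose first element is the record name.
    {person, joe, 42} = Flat,
    {erts_debug:flat_size(Nested), erts_debug:flat_size(Flat)}.
```

Calling person_size:sizes() should show the nested form costing noticeably
more words than the record form, since each inner tuple carries its own header
word in addition to its elements.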

> b) Use the halfword emulator. The process-local representation would
> in that case be much better.
>
> c) If you store millions of these tuples, chances are you want to use
> ETS for it. ETS takes a "compressed" option which trades speed for
> space.
>

When I said "held in-memory", it was for purposes like HTTP session context,
or a proprietary TCP-based protocol's session context. Each of those
contexts is in active use, so the "person" tuple was probably not a good
example. In such a case, with a single Erlang process holding a million active
tuples, or multiple (millions?) of Erlang processes each holding one such
tuple, would it still be a candidate for ETS?
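
For reference, the "compressed" option from (c) can be tried out with a sketch
like the one below. The module name session_store, the table names and the
1000-row sample are assumptions made up for illustration; ets:info(Tab, memory)
reports the table size in words, and the actual saving depends on the data.

```erlang
-module(session_store).
-export([demo/0]).

%% Stores the same rows in a plain and in a compressed ETS table and
%% returns {SampleRow, PlainWords, CompressedWords}.
demo() ->
    Plain = ets:new(sessions_plain, [set]),
    Packed = ets:new(sessions_packed, [set, compressed]),
    Rows = [{Id, {person, {name, joe}, {age, 42}}} || Id <- lists:seq(1, 1000)],
    true = ets:insert(Plain, Rows),
    true = ets:insert(Packed, Rows),
    %% Lookups work the same way on a compressed table; the stored term
    %% is decompressed when it is copied back to the calling process.
    [{1, Ctx}] = ets:lookup(Packed, 1),
    Result = {Ctx, ets:info(Plain, memory), ets:info(Packed, memory)},
    true = ets:delete(Plain),
    true = ets:delete(Packed),
    Result.
```

Whether this fits a session context that is in active use is a separate
question, since every read from ETS copies the term out of the table.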

> d) If you have rather rare access to the data, you can store them in a
> binary format which takes up less space.
>

As in, programmatically turn the Erlang representation into a blob (of sorts)?
Does Erlang come with some batteries for this purpose, or some recommended
best practices, or is it completely DIY?
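
One built-in candidate is term_to_binary/2 with binary_to_term/1, sketched
below. Whether this is the binary format meant in (d) is an assumption, and the
module name blob is made up for illustration.

```erlang
-module(blob).
-export([pack/1, unpack/1]).

%% Serialises any Erlang term into a binary using the external term
%% format; the compressed option asks for zlib compression when it helps.
pack(Term) ->
    term_to_binary(Term, [compressed]).

%% Turns a binary produced by pack/1 back into the original term.
unpack(Bin) when is_binary(Bin) ->
    binary_to_term(Bin).
```

A round trip such as blob:unpack(blob:pack({person, joe, 42})) gives back the
original term. For very small terms the binary is not necessarily smaller than
the heap representation, so it is worth measuring with byte_size/1 on real
data before committing to this.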

> Is it a concern? In some cases yes, but note that buying more memory
> is a relatively cheap thing and that you have the above options to
> counter it.

Okay, that makes sense.

> My main concern is that the internal representation of
> this, heavily boxed as it is, costs memory lookups - which then cost
> execution speed compared to an unboxed representation.

In this particular context, do we mean that...
 { person, { name, joe }, { age, 42 } }    -- is the boxed representation, and
 { person, joe, 42 }                       -- is the unboxed representation?

or
  | $j,$o,$e | 42.0000 |  (as 32 bytes in memory) -- is the unboxed (internal)
representation?

I.e., what exactly are we calling the boxed / unboxed representation in this
context?

> My concern
> would be with the cache, not the total memory use.
>

Agreed. This is indeed the biggest concern, but I've seen far too many Java
/ J2EE applications where system memory becomes a constraint long before the
CPU is even at 25% of saturation, especially where we want to avoid swapping.
Does anyone see that as a possibility here with Erlang as well?