<div class="gmail_quote">Appreciate the explanation.</div><div class="gmail_quote"><br></div><div class="gmail_quote">On Mon, Apr 25, 2011 at 5:18 PM, Jesper Louis Andersen <span dir="ltr"><<a href="mailto:jesper.louis.andersen@gmail.com">jesper.louis.andersen@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im">On Mon, Apr 25, 2011 at 11:16, Icarus Alive <<a href="mailto:icarus.alive@gmail.com">icarus.alive@gmail.com</a>> wrote:<br>


<br>

> 1. Found the following usage of tuple --<br>

>        { person, { name, joe }, { age, 42 } }<br>

>     Now when one has millions of such tuples, which are held in-memory,<br>

> isn't the memory footprint of the application at least 3  x 8-byte<br>

> extraneous thanks to the 3 atoms (person, name, age), playing "field names"<br>

> ? Is this a concern for anyone ? Is there a more efficient storage ?<br>

<br>

</div>It is actually far worse than 3 * 8 bytes as the representation is<br>

boxed everywhere. So each of the tuples is also taking up memory.<br>

There are some ways to limit the memory usage though:<br>

<br>

a) Use records. If we have a record #person { name = joe, age = 42},<br>

the internal representation is {person, joe, 42} which avoids a lot of<br>

the tupling overhead.<br></blockquote><div><br></div><div>Okay, jumping ahead and reading about records, it looks like they save at least 2 atoms per tuple, though there is still an 8-byte cost to pay for some level of introspection.</div>
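<div><br></div><div>To check my understanding of (a), here is a minimal sketch. The module name person_demo and the function sizes/0 are made up for illustration, and erts_debug:flat_size/1 is an undocumented debugging helper, so I would treat what it reports as indicative only. It compares the heap size, in words, of the nested tuple and the record form:</div>
<pre>
-module(person_demo).
-export([sizes/0]).

-record(person, {name, age}).

%% Returns {NestedWords, RecordWords}: heap words used by each form.
sizes() ->
    Nested = {person, {name, joe}, {age, 42}},
    Rec = #person{name = joe, age = 42},
    %% At runtime a record is just a tagged tuple.
    {person, joe, 42} = Rec,
    {erts_debug:flat_size(Nested), erts_debug:flat_size(Rec)}.
</pre>
<div>If I count one header word per tuple plus one word per element, that should come to 10 words for the nested form versus 4 for the record form, i.e. 80 versus 32 bytes on a 64-bit emulator.</div>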

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">b) Use the halfword emulator. The process-local representation would<br>

in that case be much better.<br>

<br>

c) If you store millions of these tuples, chances are you want to use<br>

ETS for it. ETS takes a "compressed" option which trades speed for<br>

space.<br></blockquote><div><br></div><div>When I said "held in-memory" I meant things like an HTTP session context, or the session context of a proprietary TCP-based protocol. Each of those contexts is in active use, so the "person" tuple was probably not a good example. In such a case, with either a single Erlang process holding a million active tuples or multiple (millions?) of Erlang processes each holding one such tuple, would it still be a candidate for ETS?</div>
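<div><br></div><div>For reference, this is roughly what I imagine (c) would look like for session contexts. It is only a sketch: the module name session_store, the table name sessions and the key session_1 are placeholders of mine, and I have not measured how much the compressed option actually saves.</div>
<pre>
-module(session_store).
-export([demo/0]).

%% Creates a compressed ETS table, stores one session tuple, reads it
%% back, and returns the table's memory use in bytes.
demo() ->
    %% 'compressed' trades access speed for a more compact stored form.
    Tab = ets:new(sessions, [set, public, compressed]),
    true = ets:insert(Tab, {session_1, joe, 42}),
    [{session_1, joe, 42}] = ets:lookup(Tab, session_1),
    %% ets:info(Tab, memory) is reported in words, so convert to bytes.
    Bytes = ets:info(Tab, memory) * erlang:system_info(wordsize),
    true = ets:delete(Tab),
    Bytes.
</pre>
<div>A public table can be read and written by any process, which is why I picked it for the "millions of processes" case. Note that every lookup copies the tuple out of the table into the calling process's heap.</div>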

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">d) If you have rather rare access to the data, you can store them in a<br>

binary format which takes up less space.<br></blockquote><div><br></div><div>As in, programmatically turn the Erlang representation into a blob (of sorts)? Does Erlang come with some batteries for this purpose, or some recommended best practices, or is it completely DIY?</div>
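<div><br></div><div>If the built-in term_to_binary and binary_to_term functions are the batteries meant here, I assume (d) would be something along these lines. The module name blob_demo and the function roundtrip/0 are hypothetical, and whether the compressed form is actually smaller will depend on the term.</div>
<pre>
-module(blob_demo).
-export([roundtrip/0]).

%% Serialises a term to a binary (plain and compressed), checks that both
%% decode back to the original, and returns the two sizes in bytes.
roundtrip() ->
    Term = {person, joe, 42},
    Blob = term_to_binary(Term),
    Packed = term_to_binary(Term, [compressed]),
    %% Matching against the already-bound Term asserts a lossless round trip.
    Term = binary_to_term(Blob),
    Term = binary_to_term(Packed),
    {byte_size(Blob), byte_size(Packed)}.
</pre>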

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Is it a concern? In some cases yes, but note that buying more memory<br>

is a relatively cheap thing and that you have the above options to<br>

counter it. </blockquote><div><br></div><div>Okay, that makes sense.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">My main concern is that the internal representation of<br>


this, heavily boxed as it is, costs memory lookups - which then cost<br>

execution speed compared to an unboxed representation. </blockquote><div><br>In this particular context, do we mean that...</div><div> { person, { name, joe }, { age, 42 } }               -- is boxed representation, and</div>

<div> { person, joe, 42 }                                        -- is unboxed representation ?</div><div><br></div><div>or</div><div>  | $j,$o,$e | 42.0000 |  (as 32 bytes in memory) -- is unboxed (internal) representation ?</div>

<div><br></div><div>I.e., what exactly are we calling the boxed / unboxed representation in this context?</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

My concern<br>

would be with the cache, not the total memory use.<br></blockquote><div><br></div><div>Agreed. This is indeed the biggest concern, but I've seen far too many Java / J2EE applications where system memory becomes the constraint long before the CPU is even at 25% of saturation, especially where we want to avoid swapping. Does anyone see that as a possibility with Erlang as well?</div>

<div><br></div></div>