[erlang-questions] Wanted additions to the maps module?

Raimo Niskanen raimo+erlang-questions@REDACTED
Wed May 11 17:40:38 CEST 2016


On Wed, May 11, 2016 at 09:48:41AM +0000, Grzegorz Junka wrote:
> 
> On 11/05/2016 08:55, Raimo Niskanen wrote:
> > On Tue, May 10, 2016 at 02:30:19PM +0000, Grzegorz Junka wrote:
> >> In my test I am adding the memory consumed by the process and in ETS:
> >> process_size(E) + ets:info(Tid, memory) * erlang:system_info(wordsize).
> >> It's hard but one needs to start somewhere. It's not enough to say it's
> >> hard, don't do it.
> > I did not say "don't do it".  I said it is not a fair comparison,
> > with the underlying assumption that you only compared heap usage.
> >
> > But since you apparently have tried to compensate for that unfairness
> > you just might have a fairly fair comparison.
> >
> > As Björn-Egil pointed out, though, process_info(self(), memory) might not
> > be a good metric without a garbage_collect() preceding it, depending on
> > the application you benchmark for.
> >
> > You do not know for certain that the final process size is the maximum
> > during the test, depending on when garbage collections are triggered.  Have
> > a look at erlang:system_monitor(Pid, [{large_heap,Size}]).  It might be
> > relevant in this benchmark.
> 
> I am not that interested in the size of the process containing the data
> structure after it has been garbage collected. I want to know how much
> garbage the data structure is generating, because this directly affects
> the size of the VM when many such processes are running in parallel.
> I don't know of a fairer way of measuring that than by getting the size
> of the process straight after adding and reading the specified number of
> keys.

Then I say that a fairer way to know how much garbage a process creates is
to use erlang:trace(WorkerProcess, true,
                    [garbage_collection,{tracer,TracerProcess}]).

The messages from that feature can tell you how much garbage is being
reclaimed.
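
Something like this minimal sketch (my own names and helper, not the code
from your benchmark; the event atoms differ between OTP releases, so both
the older gc_start/gc_end and the newer gc_minor_*/gc_major_* forms are
matched, and the drop in heap_size across each collection is used as a
rough proxy for reclaimed words):

-module(gc_trace).
-export([measure/1]).

%% Run Fun in a fresh process, trace its garbage collections, and
%% return roughly how many heap words were reclaimed in total.
measure(Fun) ->
    Tracer = self(),
    {Worker, MRef} =
        spawn_monitor(fun() ->
                              receive go -> ok end,
                              Fun()
                      end),
    erlang:trace(Worker, true, [garbage_collection, {tracer, Tracer}]),
    Worker ! go,
    collect(Worker, MRef, undefined, 0).

collect(Worker, MRef, Before, Reclaimed) ->
    receive
        {trace, Worker, Start, Info}
          when Start =:= gc_start;
               Start =:= gc_minor_start;
               Start =:= gc_major_start ->
            {heap_size, H} = lists:keyfind(heap_size, 1, Info),
            collect(Worker, MRef, H, Reclaimed);
        {trace, Worker, End, Info}
          when End =:= gc_end;
               End =:= gc_minor_end;
               End =:= gc_major_end ->
            {heap_size, H} = lists:keyfind(heap_size, 1, Info),
            Freed = case Before of
                        undefined -> 0;
                        B -> max(B - H, 0)
                    end,
            collect(Worker, MRef, undefined, Reclaimed + Freed);
        {'DOWN', MRef, process, Worker, _Reason} ->
            Reclaimed
    end.

Then gc_trace:measure(fun() -> build_your_structure() end), with
build_your_structure/0 standing in for your insert-and-read loop, gives a
number that does not depend on where in the heap growth cycle the test
happens to stop.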

> 
> Sure, the size of the process may shrink at some later time, but in a 
> real application when I don't call gc purposefully the size of the 
> process may also stay larger for a longer period of time before gc 
> shrinks it.

A possible problem is that the process may have just recently garbage
collected for some benchmark cases but not for others, quite randomly,
before your size measurement.
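
That is also why a garbage_collect() right before reading the size makes
the numbers comparable between runs, at the price of only measuring live
data, e.g. (a hypothetical helper, not from your code):

measure_live_size(Pid) ->
    true = erlang:garbage_collect(Pid),
    {memory, Bytes} = erlang:process_info(Pid, memory),
    Bytes.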

> 
> Please note that there is no deleting of key-value pairs, only adding 
> and reading. So any garbage generated is caused by changes to the 
> internal representation of the table holding the growing number of
> elements. The allocator can't reuse memory chunks released by elements 
> deleted previously. This may not be applicable to all real-life 

On the contrary, it can reuse such memory chunks.

Tuple example:
* We start with a tuple of 1000 elements, and a list of 30000 elements to
  insert in the tuple.  Each element uses 3 words, plus 2 in the list cons
  cell, that is 5 * 30000 words, plus the containing tuple of 1001 words,
  which makes 151001 words.  The nearest size above that in the Fibonacci
  series that the heap algorithm uses is 189886 words.
* We then add one element to the tuple. We only need to allocate 1002 words
  for the new containing tuple, and we have 189886 - 151001 = 38885 so that
  is no problem.
* After creating the new containing tuple the old one is garbage, but not
  yet noticed as such.  We continue to add elements and therefore allocate
  new containing tuples of 1003, 1004, ... words, so after about thirty
  added elements we run out of heap and allocate a new heap of 307242
  (Fibonacci, again) words.
* After creating the containing tuple that caused the new heap allocation
  we garbage collect and find out that only the last tuple is still needed,
  so we have allocated 307242 words but only use 5 * 30000 + 1030 = 151030
  words.  So now we can go on with around (307242 - 151030) / 1030 ~= 150
  element insertions before the next heap allocation, where we again will
  allocate the next size 497128 and only use about 5 * 30000 + 1180 =
  151180 words.  That is 497128 / 151180 ~= 3.3 times more than needed.
* If the garbage collect causes the new heap to be less than one fourth
  of the old heap size the new heap may be _shrunk_.
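
You can watch this happening with a small sketch like the one below (my
code, not part of your benchmark): grow a tuple one element at a time next
to a long-lived list and print the heap size now and then; the reported
size jumps in the steps described above instead of growing smoothly.

-module(grow_tuple).
-export([run/0]).

run() ->
    spawn(fun() ->
                  List = lists:seq(1, 30000),   % long-lived data, stays live
                  grow(erlang:make_tuple(1000, x), 1, List)
          end).

grow(_Tuple, N, _List) when N > 2000 ->
    done;
grow(Tuple, N, List) ->
    case N rem 100 of
        0 ->
            {total_heap_size, Words} =
                erlang:process_info(self(), total_heap_size),
            io:format("~p elements added, total heap ~p words~n",
                      [N, Words]);
        _ ->
            ok
    end,
    grow(erlang:append_element(Tuple, N), N + 1, List).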

So depending on whether you happen to stop your iteration just before or
just after a garbage collection, maybe a shrinking one, you will get
totally different heap sizes.

To further complicate the situation, much of your data, for example the
list you start with, will survive two garbage collections and therefore be
stored on the old heap.  This also applies to the parts of every tree data
structure you benchmark that do not change often.  This throws all the
numbers in my tuple example above out the window.
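
If you want to see how much of a process's live data has been tenured to
the old heap, something like this can be used (a sketch; the
garbage_collection_info item is only available in reasonably recent OTP
releases):

old_vs_young(Pid) ->
    {garbage_collection_info, Info} =
        erlang:process_info(Pid, garbage_collection_info),
    {proplists:get_value(old_heap_size, Info),
     proplists:get_value(heap_size, Info)}.

It returns the old-heap and young-heap usage in words.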

Try increasing Accesses to 300000, and, to make the test complete in a
reasonable amount of time, remove list and orddict from Types, and you will
get completely different results from your benchmark.  We get, for example,
_negative_ results for dict, and map looks like a winner.  This kind of
wobbling result depending on the scale of the test is a reliable sign of an
unreliable benchmark...


> scenarios but should more accurately reflect the behaviour where the
> data structure holds many key-value pairs and the application rarely
> changes existing pairs but mostly adds new key-value pairs.
> 
> Grzegorz

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


