[erlang-questions] Question/Alternative on Frames Proposal [Warning: Long]

Tue May 22 00:42:53 CEST 2012

On 22/05/2012, at 12:15 AM, Garrett Smith wrote:
>> 
>> I'm not sure what you are asking.  The paper that you mentioned made it
>> quite clear that they were NOT experimenting with frames or anything in
>> the same area of design space.
> 
> Richard, I'm not seeing what you're seeing.
> 
> That's a PDF of presentation slides from 2011 User Conf-- it lists
> "frames" in the title and as a input to their process.

We're talking about
http://www.erlang-factory.com/upload/presentations/468/EUC_Hashes2011.pdf
and that's all I have to go on.

Slide  0: title
Slide  1: there's always room for improvements RECORDS Hash maps
Slide  2: outline of records (few named fields, many instances, matchable)
Slide  3: not a distinct data type, not suitable for many fields
Slide  4: outline of hash maps (proplists, dict, gb_trees, gb_sets)
Slide  5: what's wrong with proplists
Slide  6: what's wrong with dict
Slide  7: Requirements/Goals ***FOR HASHMAPS***

	And it is quite clear from content slide 7 that they are
	talking about something that can do the job of dict and
	gb_tree; suitability for type checking is NOT listed.

	More precisely, content slide 7 gives the impression of
	a vague desire for something that's a floor polish AND
	a desert topping, a bulk hashed container AND a possible
	replacement for records.

Slide  8: mentions me, Joe Armstrong, Anton Lavrik, Phil Bagwell.
Slide  9: Erlson syntax.
Slide 10: >>> We have implemented prototypes and performed measurements
	  using the Vlists and HAMT datastructures internally in the
	  Erlang VM <<<

	It is very very clear from this that the experiments
	concerned something suitable for *LARGE* data structures.
Slide 11: Diagram of records as tuples.
Slide 12: Diagram of VLists / vhash lists.
Slide 13: Diagram of Hash Array Mapped Tries.

	Popcount is one of those things that *ought* to be fast but
	that often get left out of actual chips.  POPCNT wasn't on
	PCs until Core i7, I believe.  It's in the SPARC V9
	architecture, but is not in all SPARC V9 _hardware_, like
	the machine on my desk.

Slide 14: Diagram of hash table with exception list
Slide 15: Another such diagram

	This appears to be an adaptation of a 1980s technique for
	O(1) array update; the latest version is compact and in
	place while old versions have a list of deltas.

Slide 16: Graph of time in microseconds against number of elements
	inserted for dict, gb_trees, vlist_hash, and hamt (up to ~ 30)
Slide 17: Another such (up to ~ 65 000)
Slide 18: Graph of lookup time (up to ~ 30)
Slide 19: Another such (up to ~ 65 000)
Slide 20: Graph of memory (up to ~ 30 elements) 

	These data structures take from 4 to 8 words per element.
	Frames, in contrast, take 1 to 2 words.

Slide 21: Another such graph (up to ~ 65 000 elements)
Slide 22: Table of requirements by data structures.

	Frames are conspicuous by their absence.

Slide 23: "new better records" "and Hashmaps" "ARE TWO DIFFERENT THINGS."

	I agree with this distinction and made it a decade ago.

	This talk was about Hashmaps, NOT about anything credible as
	new better records.

It will be really good for Erlang to have a better implementation of
hashmaps.  This was good work.  It owes nothing to me; I'd never even
heard of Hash Array Mapped Tries before. I look forward to seeing
something coming of it.

But there is NOTHING reported in that presentation that evaluates
alternatives for "new better records".

> In any event, is not relevant to any discussion here what the OTP team
> is actually working on? Is this is a blip of interest in a decade old
> proposal that has no shot of getting into the language?

What's there is a solid examination of "can we provide better support
for functionally pure persistent dictionaries than what we've already
got" that briefly drags a "new better records" red herring over the
track at the beginning and then drops it.  It expresses the _wish_
that one data structure might be usable for both purposes, and I
suppose you could argue that slide 20 does *implicitly* demonstrate
that the current set of candidates for better dictionaries are *NOT*
credible candidates for better records.

To be fair, there's no reason why a single logical data structure
has to have a *uniform* representation at run time (as I was saying
to a data base class yesterday).  One could use something like frames
up to a certain size and something else above that size.

one data structure _might