[erlang-questions] Question/Alternative on Frames Proposal [Warning: Long]
Richard O'Keefe
ok@REDACTED
Tue May 22 00:42:53 CEST 2012
On 22/05/2012, at 12:15 AM, Garrett Smith wrote:
>>
>> I'm not sure what you are asking. The paper that you mentioned made it
>> quite clear that they were NOT experimenting with frames or anything in
>> the same area of design space.
>
> Richard, I'm not seeing what you're seeing.
>
> That's a PDF of presentation slides from 2011 User Conf-- it lists
> "frames" in the title and as a input to their process.
We're talking about
http://www.erlang-factory.com/upload/presentations/468/EUC_Hashes2011.pdf
and that's all I have to go on.
Slide 0: title
Slide 1: there's always room for improvements RECORDS Hash maps
Slide 2: outline of records (few named fields, many instances, matchable)
Slide 3: not a distinct data type, not suitable for many fields
Slide 4: outline of hash maps (proplists, dict, gb_trees, gb_sets)
Slide 5: what's wrong with proplists
Slide 6: what's wrong with dict
Slide 7: Requirements/Goals ***FOR HASHMAPS***
And it is quite clear from content slide 7 that they are
talking about something that can do the job of dict and
gb_tree; suitability for type checking is NOT listed.
More precisely, content slide 7 gives the impression of
a vague desire for something that's a floor polish AND
a desert topping, a bulk hashed container AND a possible
replacement for records.
Slide 8: mentions me, Joe Armstrong, Anton Lavrik, Phil Bagwell.
Slide 9: Erlson syntax.
Slide 10: >>> We have implemented prototypes and performed measurements
using the Vlists and HAMT datastructures internally in the
Erlang VM <<<
It is very very clear from this that the experiments
concerned something suitable for *LARGE* data structures.
Slide 11: Diagram of records as tuples.
Slide 12: Diagram of VLists / vhash lists.
Slide 13: Diagram of Hash Array Mapped Tries.
Popcount is one of those things that *ought* to be fast but
that often get left out of actual chips. POPCNT wasn't on
PCs until Core i7, I believe. It's in the SPARC V9
architecture, but is not in all SPARC V9 _hardware_, like
the machine on my desk.
Slide 14: Diagram of hash table with exception list
Slide 15: Another such diagram
This appears to be an adaptation of a 1980s technique for
O(1) array update; the latest version is compact and in
place while old versions have a list of deltas.
Slide 16: Graph of time in microseconds against number of elements
inserted for dict, gb_trees, vlist_hash, and hamt (up to ~ 30)
Slide 17: Another such (up to ~ 65 000)
Slide 18: Graph of lookup time (up to ~ 30)
Slide 19: Another such (up to ~ 65 000)
Slide 20: Graph of memory (up to ~ 30 elements)
These data structures take from 4 to 8 words per element.
Frames, in contrast, take 1 to 2 words.
Slide 21: Another such graph (up to ~ 65 000 elements)
Slide 22: Table of requirements by data structures.
Frames are conspicuous by their absence.
Slide 23: "new better records" "and Hashmaps" "ARE TWO DIFFERENT THINGS."
I agree with this distinction and made it a decade ago.
This talk was about Hashmaps, NOT about anything credible as
new better records.
It will be really good for Erlang to have a better implementation of
hashmaps. This was good work. It owes nothing to me; I'd never even
heard of Hash Array Mapped Tries before. I look forward to seeing
something coming of it.
But there is NOTHING reported in that presentation that evaluates
alternatives for "new better records".
> In any event, is not relevant to any discussion here what the OTP team
> is actually working on? Is this is a blip of interest in a decade old
> proposal that has no shot of getting into the language?
What's there is a solid examination of "can we provide better support
for functionally pure persistent dictionaries than what we've already
got" that briefly drags a "new better records" red herring over the
track at the beginning and then drops it. It expresses the _wish_
that one data structure might be usable for both purposes, and I
suppose you could argue that slide 20 does *implicitly* demonstrate
that the current set of candidates for better dictionaries are *NOT*
credible candidates for better records.
To be fair, there's no reason why a single logical data structure
has to have a *uniform* representation at run time (as I was saying
to a data base class yesterday). One could use something like frames
up to a certain size and something else above that size.
one data structure _might
More information about the erlang-questions
mailing list