[erlang-questions] Printed list to Erlang function

Thu Mar 31 00:43:39 CEST 2016

On 31/03/16 9:11 am, Hynek Vychodil wrote:
> On Wed, Mar 30, 2016 at 8:23 PM, Jason Douglas <jasond@REDACTED 
> <mailto:jasond@REDACTED>> wrote:
>
>     At very large sizes though, wouldn’t the map (with O(1) access)
>     become faster than the trie built by the compiler (which will be
>     proportional to lookup term length, when sufficiently dense)?
>
Suppose you are looking up a key of length L in a map, and that everything
goes right.  The hash function will (if it is a *good* one) examine 
every part
of the key, so hashing will take O(L).  Assuming everything goes right, it
will take O(1) time to find the right bucket and there will be only one item
in it, but it will take another O(L) to confirm that you have indeed 
found the
right key.  So maps are *NOT* O(1), they are O(L).

What about a trie?  A trie over a binary alphabet takes O(1) time per
element of the key, for a total of O(L).  The constant factor could go
either way depending on the implementation.

But of course this *isn't* a binary alphabet.  Let B be the average
branching factor of the trie; the cost will be O(L.B) with a linear
search at each node, O(L.log(B)) with a binary search at each node,
or O(L) with some sort of local hashing.  And this depends on exactly
what the compiler does, and may change from release to release.

For the record, I never said that the trie would beat a map.
I said it would beat lists:member/2.

Concerning good hash functions, see
"Hashing in Smalltalk: Theory and Practice" by Andrés Valloud.
It's not just the best book I know that explains how to actually
construct and evaluate hash functions for a variety of data
structures, it's the *only* one.  (I have a paper about how to
compute hash functions of sets and dictionaries; we can do a
lot better than the simplistic combine-using-plus-or-xor that's
usually done.  Haven't found a journal to take it yet.)