[erlang-questions] Trie Data Structure

Thu Dec 30 15:54:47 CET 2010

The trie:test_speed/0 function is what I used for determining how fast the
trie was for lookups, when compared to a dict.  The wordlist used comes from
the Ubuntu 10.04 installation and is located at /usr/share/dict/words (98569
words with an average of 9.5 characters per word).  You can also specify
your own wordlist with test_speed/1.  The comparison stores all the words
and does lookups on all the words (using is_key, fetch, and find) while
tracking the time taken to do all the lookups.

On Wed, Dec 29, 2010 at 9:04 PM, Ryan Zezeski <rzezeski@REDACTED> wrote:

> On Wed, Dec 29, 2010 at 9:21 PM, Michael Truog <mjtruog@REDACTED> wrote:
>
>> Hi,
>>
>> I implemented a trie data structure that has performed 20% faster lookups
>> than the dict implementation in R14B01.  I have tested the code but it can
>> always benefit from more testing.  The trie module provides the same
>> interface as the dict module, but requires a list of integers for the key.
>> Please tell me if you find issues in the code.
>>
>> Code:
>>
>> https://github.com/okeuday/CloudI/blob/master/src/lib/cloud_stdlib/src/trie.erl
>>
>> Blog posts:
>> http://okeuday.livejournal.com/16454.html
>> http://okeuday.livejournal.com/16182.html (dict as the fastest general
>> tree
>> data structure)
>>
>> I know you can use ETS for similar problems, but avoiding shared data
>> helps
>> utilize Erlang's scalability advantages.
>>
>> Thanks,
>> Michael Truog
>>
>
> Do you have the benchmark and dataset available as well?  I wrote a trie in
> Erlang a few weeks ago and I'm just curious how it stacks up.
>
> --
> *.........01010...00..00...0100...00..00.
> .........10..11...0100...11..00..101.01.
> .111010..01100.....10....101110..01.101.
> .........10..10....11....01..10..10..01.
> .........00..00....00....00..00..00..00.
> ........................................*
>