[erlang-questions] Trie Data Structure

Michael Truog mjtruog@REDACTED
Sun Jan 2 22:31:56 CET 2011


Actually, use the benchmark code here:
https://github.com/okeuday/erlbench

Blog Post:
http://okeuday.livejournal.com/16707.html

The trie fetch performance is roughly 15% faster than a dict for 10000
entries, but might get better as the number of entries increases (though the
performance depends on the distribution of the key size, i.e., string
length).

On Thu, Dec 30, 2010 at 6:54 AM, Michael Truog <mjtruog@REDACTED> wrote:

> The trie:test_speed/0 function is what I used for determining how fast the
> trie was for lookups, when compared to a dict.  The wordlist used comes from
> the Ubuntu 10.04 installation and is located at /usr/share/dict/words (98569
> words with an average of 9.5 characters per word).  You can also specify
> your own wordlist with test_speed/1.  The comparison stores all the words
> and does lookups on all the words (using is_key, fetch, and find) while
> tracking the time taken to do all the lookups.
>
>
> On Wed, Dec 29, 2010 at 9:04 PM, Ryan Zezeski <rzezeski@REDACTED> wrote:
>
>> On Wed, Dec 29, 2010 at 9:21 PM, Michael Truog <mjtruog@REDACTED> wrote:
>>
>>> Hi,
>>>
>>> I implemented a trie data structure that has performed 20% faster lookups
>>> than the dict implementation in R14B01.  I have tested the code but it
>>> can
>>> always benefit from more testing.  The trie module provides the same
>>> interface as the dict module, but requires a list of integers for the
>>> key.
>>> Please tell me if you find issues in the code.
>>>
>>> Code:
>>>
>>> https://github.com/okeuday/CloudI/blob/master/src/lib/cloud_stdlib/src/trie.erl
>>>
>>> Blog posts:
>>> http://okeuday.livejournal.com/16454.html
>>> http://okeuday.livejournal.com/16182.html (dict as the fastest general
>>> tree
>>> data structure)
>>>
>>> I know you can use ETS for similar problems, but avoiding shared data
>>> helps
>>> utilize Erlang's scalability advantages.
>>>
>>> Thanks,
>>> Michael Truog
>>>
>>
>> Do you have the benchmark and dataset available as well?  I wrote a trie
>> in Erlang a few weeks ago and I'm just curious how it stacks up.
>>
>> --
>> *.........01010...00..00...0100...00..00.
>> .........10..11...0100...11..00..101.01.
>> .111010..01100.....10....101110..01.101.
>> .........10..10....11....01..10..10..01.
>> .........00..00....00....00..00..00..00.
>> ........................................*
>>
>
>


More information about the erlang-questions mailing list