Efficient Functional Data Structures

Eric Newhuis enewhuis@REDACTED
Mon Jan 6 01:28:03 CET 2003


I threw out ETS perhaps prematurely (?) when I realized I couldn't match
on variable-length keys.

In my world keys are lists of Erlang terms.  Can ETS find matches when
some terms are don't-care?  I think the answer is yes.  But can ETS
match on variable-length keys? I don't know the answer.

Given a set of keys such as [a,b,c], [a,b,c,d], and [a,b,c,d,e], can ETS
find all three keys given a search pattern like [a,b,'*'] ?

I think I tried with patterns like [a,b,'_'] but that only finds [a,b,c]
and fails to get the others.

I suppose I could keep track of the longest keys inserted and expand
queries containing '*' into all permutations of a,b,'_','_',... up to
the maximum key length and execute multiple searches.  But that feels so
dirty.

Sincerely,
Eric
 


-----Original Message-----
From: Matthias [mailto:matthias@REDACTED] On Behalf Of Matthias
Lang
Sent: Saturday, January 04, 2003 2:47 AM
To: Eric Newhuis
Cc: erlang-questions@REDACTED
Subject: Re: Efficient Functional Data Structures


Eric Newhuis writes:
 > Is there any good information out there on efficient functional data
> structures?  

The classic is

    Chris Okasaki
    Purely Functional Data Structures
    Cambridge University Press; ISBN: 0521663504; 0 edition (July 1,
1999) 

He also has a lot of articles online at

    http://www.eecs.usma.edu/Personnel/okasaki/pubs.html#cup98

Reading the book was a bit of an "a-ha" experience for me. It clarified
the muddy ideas I had about functional data structures. Unfortunately
half the book is about lazy data structures.

 > I have two concerns.
 > 
 > The small concern:  Every time I add a child node I need to rebuild
the  > trie all the way back up to the root.  {orddict:store (Elem,
store  > (...), ...), ...}

This is a fundamental property of purely functional trees. You can only
get around it by introducing destructive update in one guise or another.
See also:

    http://www.research.avayalabs.com/user/wadler/topics/monads.html

 > Questions for the experts:  What can I do with "textbook" Erlang?
What  > can I do to cheat (possibly with a C "hack")?  Am I doing things
as good  > as I can with clean Erlang?  What can I do to speed things
up?

I haven't tried very hard to understanding your code, so I won't
comment.

My experience is that ETS is hard to beat as soon as you get even a
moderate amount of data (a few thousand nodes). Chances for beating ETS
improve if your data structure doesn't have a simple mapping to ETS or
if the data structure tends to be short-lived. Previous 
related discussions:

 
http://www.erlang.org/ml-archive/erlang-questions/200010/msg00193.html
 
http://www.erlang.org/ml-archive/erlang-questions/200012/msg00045.html

Matthias




More information about the erlang-questions mailing list