[erlang-questions] random lookup in ets

Anthony Molinaro anthonym@REDACTED
Tue Aug 24 19:45:16 CEST 2010


On Tue, Aug 24, 2010 at 03:21:17PM +0200, Robert Virding wrote:
> On 23 August 2010 22:30, Anthony Molinaro <anthonym@REDACTED> wrote:
> > On Mon, Aug 23, 2010 at 07:05:40AM +0200, Pascal Chapier wrote:
> >> first, if your application manage the ets, you can add an new table as
> >> index, with an integer list as key, and the main ets key as value. then
> >> you will have to select a random integer between min and max index key,
> >> retreive the main ets key and then the value. The problem with this is
> >> that you will have to keep the index keys continous, it is not so easy
> >> and may takes some time when you make a delete operation. On the other
> >> hand it should be fast for random read.
> >
> > If you are keeping a secondary index you could do what Paul Mineiro
> > suggested I do with a similar problem.  In my problem I have non-overlapping
> > ranges of 32-bit integers.  For each range I have data associated with
> > the range.  Given an integer I wanted to look up the value associated
> > with the range.  I use an ordered_set ets table which contains tuples
> > of the form
> >
> > { StartOfRange, SecondaryIndexKey }
> >
> > Then given an integer I do
> >
> > { StartOfRange, SecondaryIndexKey } =
> >  case ets:lookup (Index, IntIp) of
> >    [] ->
> >      case ets:lookup (Index, ets:prev (Index, IntIp)) of
> >        [] -> {0, -1};
> >        V -> hd (V)
> >      end;
> >    V -> hd (V)
> >  end,
> >
> > To get back the SecondaryIndexKey (ets:prev will return the previous entry,
> > so using it when I don't exact match will give me the entry responsible
> > for the range).  This is extremely fast (I haven't tested it but something
> > like
> > [ lookup (random:uniform (trunc (math:pow (2,32))))
> >  || V <- lists:seq (1, 10000) ]
> >
> > where lookup/1 actually looks up the SecondaryIndexKey, gets the ets value
> > out of the secondary table (which is a large binary object, my
> > SecondaryIndexKey are actually offsets into a file), turns the returned
> > binary into a record where integers are turned into string via tuple
> > lookup tables, and returned.  So doing lookup 10000 times with random
> > input takes about ~950 milliseconds on my laptop, so very, very fast :)
> 
> This solutions assumes that ranges are adjacent to one another and
> that all possible indexes are a member of a range. Was this specified
> in the problem?

In my problem the ranges were adjacent and all ranges were represented
in the ets lookup table, I made sure of that when I created the ets table.

For the problem here I don't think it matters, since you don't care what
data you get back, just that its randomly choosen.

-Anthony

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <anthonym@REDACTED>


More information about the erlang-questions mailing list