[erlang-questions] random lookup in ets
Anthony Molinaro
anthonym@REDACTED
Tue Aug 24 19:45:16 CEST 2010
On Tue, Aug 24, 2010 at 03:21:17PM +0200, Robert Virding wrote:
> On 23 August 2010 22:30, Anthony Molinaro <anthonym@REDACTED> wrote:
> > On Mon, Aug 23, 2010 at 07:05:40AM +0200, Pascal Chapier wrote:
> >> first, if your application manage the ets, you can add an new table as
> >> index, with an integer list as key, and the main ets key as value. then
> >> you will have to select a random integer between min and max index key,
> >> retreive the main ets key and then the value. The problem with this is
> >> that you will have to keep the index keys continous, it is not so easy
> >> and may takes some time when you make a delete operation. On the other
> >> hand it should be fast for random read.
> >
> > If you are keeping a secondary index you could do what Paul Mineiro
> > suggested I do with a similar problem. In my problem I have non-overlapping
> > ranges of 32-bit integers. For each range I have data associated with
> > the range. Given an integer I wanted to look up the value associated
> > with the range. I use an ordered_set ets table which contains tuples
> > of the form
> >
> > { StartOfRange, SecondaryIndexKey }
> >
> > Then given an integer I do
> >
> > { StartOfRange, SecondaryIndexKey } =
> > case ets:lookup (Index, IntIp) of
> > [] ->
> > case ets:lookup (Index, ets:prev (Index, IntIp)) of
> > [] -> {0, -1};
> > V -> hd (V)
> > end;
> > V -> hd (V)
> > end,
> >
> > To get back the SecondaryIndexKey (ets:prev will return the previous entry,
> > so using it when I don't exact match will give me the entry responsible
> > for the range). This is extremely fast (I haven't tested it but something
> > like
> > [ lookup (random:uniform (trunc (math:pow (2,32))))
> > || V <- lists:seq (1, 10000) ]
> >
> > where lookup/1 actually looks up the SecondaryIndexKey, gets the ets value
> > out of the secondary table (which is a large binary object, my
> > SecondaryIndexKey are actually offsets into a file), turns the returned
> > binary into a record where integers are turned into string via tuple
> > lookup tables, and returned. So doing lookup 10000 times with random
> > input takes about ~950 milliseconds on my laptop, so very, very fast :)
>
> This solutions assumes that ranges are adjacent to one another and
> that all possible indexes are a member of a range. Was this specified
> in the problem?
In my problem the ranges were adjacent and all ranges were represented
in the ets lookup table, I made sure of that when I created the ets table.
For the problem here I don't think it matters, since you don't care what
data you get back, just that its randomly choosen.
-Anthony
--
------------------------------------------------------------------------
Anthony Molinaro <anthonym@REDACTED>
More information about the erlang-questions
mailing list