[erlang-questions] random lookup in ets
Robert Virding
rvirding@REDACTED
Tue Aug 24 15:21:17 CEST 2010
On 23 August 2010 22:30, Anthony Molinaro <anthonym@REDACTED> wrote:
> On Mon, Aug 23, 2010 at 07:05:40AM +0200, Pascal Chapier wrote:
>> first, if your application manages the ets table, you can add a new table as
>> an index, with an integer list as key, and the main ets key as value. Then
>> you will have to select a random integer between the min and max index key,
>> retrieve the main ets key and then the value. The problem with this is
>> that you will have to keep the index keys contiguous, which is not so easy
>> and may take some time when you perform a delete operation. On the other
>> hand it should be fast for random reads.
>
> If you are keeping a secondary index you could do what Paul Mineiro
> suggested I do with a similar problem. In my problem I have non-overlapping
> ranges of 32-bit integers. For each range I have data associated with
> the range. Given an integer I wanted to look up the value associated
> with the range. I use an ordered_set ets table which contains tuples
> of the form
>
> { StartOfRange, SecondaryIndexKey }
>
> Then given an integer I do
>
> { StartOfRange, SecondaryIndexKey } =
>   case ets:lookup (Index, IntIp) of
>     [] ->
>       case ets:lookup (Index, ets:prev (Index, IntIp)) of
>         [] -> {0, -1};
>         V -> hd (V)
>       end;
>     V -> hd (V)
>   end,
>
> to get back the SecondaryIndexKey (ets:prev returns the previous key,
> so using it when there is no exact match gives me the entry responsible
> for the range). This is extremely fast. I haven't tested this exact code,
> but something like
>
> [ lookup (random:uniform (trunc (math:pow (2,32))))
>   || _ <- lists:seq (1, 10000) ]
>
> where lookup/1 looks up the SecondaryIndexKey, gets the ets value
> out of the secondary table (which is a large binary object; my
> SecondaryIndexKeys are actually offsets into a file), turns the returned
> binary into a record where integers are turned into strings via tuple
> lookup tables, and returns it. Doing that lookup 10000 times with random
> input takes about 950 milliseconds on my laptop, so very, very fast :)
This solution assumes that the ranges are adjacent to one another and
that every possible index is a member of some range. Was this specified
in the problem?
Robert
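
To make the concern above concrete, here is a minimal sketch of a
lookup that does not rely on that assumption. It is not code from the
thread: the module name range_lookup, the functions new/1, lookup/2 and
demo/0, and the choice to store the end of each range next to its start
are all assumptions made for illustration. With only
{ StartOfRange, SecondaryIndexKey } in the table, an integer that falls
in a gap between two ranges would be attributed to the range before the
gap; keeping the end of the range allows the lookup to reject it.

```erlang
-module(range_lookup).
-export([new/1, lookup/2, demo/0]).

%% Build an ordered_set table from a list of {Start, End, Key} tuples.
%% Storing End is an assumption added for this sketch; the quoted code
%% stores only {StartOfRange, SecondaryIndexKey}.
new(Ranges) ->
    Tab = ets:new(range_index, [ordered_set]),
    true = ets:insert(Tab, Ranges),
    Tab.

%% Return {ok, Key} for the range containing Int, or 'none' if Int
%% lies in a gap between ranges or outside all of them.
lookup(Tab, Int) ->
    Candidate =
        case ets:lookup(Tab, Int) of
            [Hit] ->
                Hit;
            [] ->
                %% In an ordered_set, ets:prev/2 accepts a key that is
                %% not in the table and returns the closest smaller key.
                case ets:prev(Tab, Int) of
                    '$end_of_table' -> none;
                    Start -> hd(ets:lookup(Tab, Start))
                end
        end,
    case Candidate of
        {_Start, End, Key} when Int =< End -> {ok, Key};
        _ -> none
    end.

%% Two ranges with a gap between 19 and 30.
demo() ->
    Tab = new([{10, 19, a}, {30, 39, b}]),
    {ok, a} = lookup(Tab, 10),   % exact match on a range start
    {ok, a} = lookup(Tab, 15),   % inside the first range
    none    = lookup(Tab, 25),   % in the gap between the ranges
    none    = lookup(Tab, 5),    % before the first range
    {ok, b} = lookup(Tab, 39),   % last integer of the second range
    none    = lookup(Tab, 40),   % past the last range
    true = ets:delete(Tab),
    ok.
```

If the ranges really are adjacent and cover every possible index, the
extra End field and the final check are unnecessary and the two-element
tuples from the quoted message are enough.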