[erlang-questions] random lookup in ets

Pascal Chapier pascalchapier@REDACTED
Fri Aug 27 06:42:01 CEST 2010


  Hello Ulf, I agree with you, the code is sensible.

The funny (strange) things are the barriers I can have in my mind.

I take the opportunity of this reply to re-post my previous message 
using thunderbird, and check if the result is better.

Sorry for the annoyance.

######### repost #############

Hello

It is a funny usage of the previous function in ordered set! I would 
have bet that it produces an exception in case of
non existing key (like it does with simple set).
I tried it in a shell, and it works (almost as in your example).
##############################

1> T = ets:new(table,[ordered_set,public,named_table]).
table
2> ets:tab2list(table).
[]
3> ets:insert(table,[{1,"one"},{10,"ten"},{5,"five"},{50,"fifty"}]).
true
4> ets:tab2list(table).
[{1,"one"},{5,"five"},{10,"ten"},{50,"fifty"}]
5> ets:prev(table,10).
5
6> ets:prev(table,9).
5
7> ets:prev(table,1).
'$end_of_table'
8> ets:prev(table,1000000).
50
9> T1 = ets:new(table1,[set,public,named_table]).
table1
10> ets:insert(table1,[{1,"one"},{10,"ten"},{5,"five"},{50,"fifty"}]).
true
11> ets:tab2list(table1).
[{1,"one"},{10,"ten"},{50,"fifty"},{5,"five"}]
12> ets:prev(table1,10).
1
13> ets:prev(table1,1).
'$end_of_table'
14> ets:prev(table1,9).
** exception error: bad argument
in function ets:prev/2
called as ets:prev(table1,9)
15>

##############################
This method is quite easy to implement and should be fast,
but it does not guarantee that all keys have the same probability to be 
selected,
because the interval between keys is not a constant.
In my example, if you select a random value between 1 and 60 (included) 
you will get:
'$end_of_table'  -> 1.67%
1 -> 6.67%
5 -> 8.33%
10 -> 66.67%
50 -> 16.67%
§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§

I have though about the solution with a complementary index table. I 
could be interesting, if the initial table is formed by {PrimKey,Value} 
to build 2 tables:

primTable contents {PrimKey,Index,Value},
index contents {Index,PrimKey} (and is an ordered_set).
then:
add(NewPrimKey,NewValue) ->
                              NewIndex = ets:info(index,size) + 1,
                              ets:insert(index,{NewIndex,NewPrimKey}),
                              
ets:insert(primTable,{NewPrimKey,NewIndex,NewValue}).

delete(PrimKey) ->
[{PrimKey,Index,Value}] = ets:lookup(primTable,PrimKey),
                              % warning iterate in the list if primTable 
is a bag
delete(primTable,Primkey),
LastIndex = ets:info(index,size),
case (LastIndex == Index) of
false ->    [{LastIndex,LastKey}] = ets:lookup(index,LastIndex),
ets:insert(index,{Index,LastKey}),
[{LastKey,LastIndex,Value}] = ets:lookup(primTable,LastKey),
ets:insert(primTable,{LastKey,Index,Value});
true -> ok
end,
delete(index,LastIndex).

The order of elements in the tables changes, but it should have no 
impact on probabilities.
In case of parallel accesses to the table, the function must be 
serialized, and (may be) use ets:safe_fixtable (and the code should be 
debbugged, I didn't try it... :o)
§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§
The simplest solution
> > "second, you can use ets:tab2list(Ets), and ets:info(Ets,size), generate
> > X with random:uniform(Size): a random value between 1 and Size, and
> > finally take the Xth element of the list with lists:nth(X).."

can be accelerated, even if it will still be slow with big tables:
Size = ets:info(Ets,size),
X = random:uniform(Size),
ets:lookup(get(Ets,X)).

get(Ets,N) -> get(Ets,N-1,ets:first(Ets)).
get(_,0,K) -> K;

get(Ets,N,K) -> get(Ets,N-1,ets:next(Ets,K)).

CU.



More information about the erlang-questions mailing list