[erlang-questions] random entry from mensia

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Thu Nov 18 19:11:26 CET 2010


On Thu, Nov 18, 2010 at 6:00 PM, James Churchman
<jameschurchman@REDACTED> wrote:
> Someone i know brought up an interesting question that is usually very easy to solve in sql but i could not see an obvious efficient solution in Mnesia

The same problem is present in most other K/V stores where you can not
rely on the fact that the K is an increasing value. My solution would
probably be to have another table mapping {I, K} where I is a running
index. The exercise left for the reader is to handle the case where K
is deleted :)

>
> There is the option in both ETS and Mnesia called dirty_slot, which i thought gets an item by its numeric order.

No, it doesn't.

ETS is, for some ETS types, implemented as a hash-table. This means
that the table contains buckets, some of which are full, and some of
which are empty. the slot/2 calls retrieves the contents of the
buckets in the hash-table so depending on load-factor of the table you
will amost always get a result, almost never get a result or something
in between. Leave slot for the debugging it was meant to do. For
example, an empty table use a 256 bucket has table. If you insert a
single element into this table, one of the bucket will be filled, but
the remaining 255 buckets are not, yet, with any data. It does suggest
a crude algorithm however:

1. Pick a number, i, randomly between 0 and BUCKET_SIZE (i.e., 0.256 above)
2. Search i, i+1, i+2, i+.., until you hit a non-empty bucket. Pick
randomly from this bucket.
3. If you hit $end_of_table continue the search from 0.

Depending on the randomness properties of the hash-table this is
either really really bad or adequate. I'd hesitate to call the
distribution good though as I am afraid something will go very wrong.
Also, the run-time is potentially much greater than what you expect
but you will hopefully hit something.

-- 
J.


More information about the erlang-questions mailing list