[erlang-questions] Index Overhead In Mnesia

Scott Lystig Fritchie fritchie@REDACTED
Thu Jun 12 01:32:19 CEST 2008


Sean Hinde <sean.hinde@REDACTED> wrote:

>> If so, is this a bug or a feature?

sh> It is a long standing "feature" of mnesia secondary indexes. It is
sh> one that I also ran into a few weeks into my use of Erlang/mnesia.

Yes, "feature" is a good word for it.

sh> I guess it is in the same category as repeated queue scanning for
sh> selective receive - easy to workaround in most cases, so no major
sh> incentive to change the behaviour.

At least one workaround has been mentioned elsewhere in this thread,
IIRC.  Another one is to change the term stored in each
secondary-indexed column from:

  term()                     ... where term() is too popular, e.g. 5

to something like:

  {term(), much_less_frequent_filler_term()}

... where the 2nd term is calculated by something like:

  {_, _, FillerNumber} = erlang:now()

Because the entire term is unique (or has very few duplicates), the
secondary index bag (or duplicate_bag, I forget) doesn't blow up with
O(N^2) behavior.
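
A minimal sketch of that wrapping step.  The module name index_filler
and the helpers wrap/1 and unwrap/1 are made up for illustration; they
are not part of Mnesia.  The sketch assumes a newer OTP release and takes
the filler from erlang:unique_integer/1, since erlang:now/0 is
deprecated there; on an older release the erlang:now() line above fills
the same role.

```erlang
-module(index_filler).
-export([wrap/1, unwrap/1]).

%% Pair the popular value with a filler that is rarely repeated, so the
%% term stored in the indexed column is unique or nearly so.
%% (Hypothetical helper; the filler source is an assumption.)
wrap(Term) ->
    {Term, erlang:unique_integer([positive])}.

%% Recover the original value when reading a record back.
unwrap({Term, _Filler}) ->
    Term.
```

Every write to the indexed column would go through wrap/1, and every
reader would call unwrap/1 (or match on {Term, _}) to get the plain
value back.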

This causes problems such as:

* needing to use index_match_object() (e.g. with a pattern of {5, '_'})
instead of index_read() to fetch records matching 5.

* needing to use {DesiredTerm, '_'} in other places that you perform
queries, such as using QLC.

* perhaps others that I've forgotten?
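
As a rough illustration of the QLC point, here is a sketch that selects
records by the first element of the wrapped term.  The module name
filler_query, the function matching/2 and the record shape
{item, Key, {Value, Filler}} are all assumptions made for the example.
It is written against a plain list so that it runs without Mnesia; with
Mnesia the handle would be something like mnesia:table(item), evaluated
inside a transaction.  The sketch makes no claim about whether such a
query is served from the secondary index.

```erlang
-module(filler_query).
-export([matching/2]).
-include_lib("stdlib/include/qlc.hrl").

%% Records are assumed to look like {item, Key, {Value, Filler}}.
%% Handle may be a list or any other QLC handle.
%% Returns every record whose wrapped term starts with Wanted,
%% ignoring the filler part.
matching(Wanted, Handle) ->
    qlc:e(qlc:q([R || {item, _Key, {Value, _Filler}} = R <- Handle,
                      Value =:= Wanted])).
```

For example, filler_query:matching(5, Records) would return the records
stored as {5, Filler} for any filler, which is the job index_read() did
before the column was wrapped.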

-Scott


