Mnesia and additional indexes: a cautionary tale

Thu Mar 30 09:44:47 CEST 2006

Mnesia indecies are implemented with an additional [d]ets _BAG_ table per index,
which have the secondary index as a key and the value is the key in the real table.

Insertion time in ets bag tables are linear, and have to be that way mnesia 
relies on the insertion order (in other parts).

You are not the first person to have made that mistake I can assure you :-) 

The others have most often done it on test systems, though, then they 
come and complain about mnesia's lousy insertion performance..
Maybe I should add something more about it in the manual...

/Dan

Scott Lystig Fritchie writes:
 > Greetings.  I have a cautionary tale to tell about Mnesia and adding
 > an extra attribute index.
 > 
 > The story starts with panic (mine!) late last night.  I was doing some
 > route performance tests for a Mnesia-based application: simple
 > 1-attribute changes to single records in several tables.  Updates for
 > one specific table were 2.5 *orders of magnitude* slower than all
 > others.
 > 
 > All of the tables were disc_copies tables.  All contained 200K
 > entries.  All fit quite comfortably in RAM without pissing off the
 > virtual memory system.
 > 
 > It was late, and I didn't want to struggle with remembering how to use
 > "fprof" or "eprof", so I used "cprof".  IIRC, "cprof" can profile all
 > Erlang processes without lots of brain power or keystrokes.  (It was
 > late, I was tired.)  Cprof showed that about close to 2 orders of
 > magnitude fewer VM reductions being executed.  Huh.  That was not what
 > I wanted to see.
 > 
 > Go to sleep, wake up refreshed, then tackle the problem again.
 > Additional profiling is frustrated: no Erlang functions claim the
 > extra time.  Perhaps I'm just inept at "fprof" subtlety, somehow
 > omitting the Erlang process that was eating all the CPU time?  {shrug}
 > 
 > Later in the afternoon, I shutdown Mnesia, then restart it.  My
 > application starts timing out at mnesia:wait_for_tables/2.  So I start
 > mnesia manually, then go get coffee and make a phone call.  When I
 > return 15 minutes later, Mnesia *still* hasn't finished starting up.
 > 
 > The "beam" process size should've been about 1,400KB with everything
 > loaded.  But the process size was only 390MB, and "beam" was still
 > using 100% CPU time ... doing something, I dunno what!
 > 
 > So, I kill the VM and restart.  Before starting Mnesia, I use
 > mnesia:set_debug_level(verbose).  Sure enough, I see messages like:
 > 
 >     Mnesia(pss1@REDACTED): Intend to load tables: [{'Tab1',local_only},
 >                                                 {'Tab2',local_only},
 >                                                 {'Tab3',local_only},
 >                                                 {'Tab4',local_only},
 >     					    ...
 >     					   ]
 >     Mnesia(pss1@REDACTED): Mnesia started, 0 seconds
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab1' 
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab2' 
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab3' 
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab3' 
 > 
 > ... and it hangs there, eating 100% CPU and getting no further.
 > 
 > A quick edit to mnesia_index.erl to include the attribute position
 > number shows me this instead:
 > 
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab1' Pos 3
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab2' Pos 7
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab3' Pos 3
 >     Mnesia(pss1@REDACTED): Creating index for 'Tab3' Pos 5
 > 
 > Ah!  Suddenly, it becomes very, very clear.
 > 
 > The table 'Tab3' contains 200K of debugging/development records.  When
 > the code to create those records was first written, the attribute at
 > position #5 was a constant binary term.
 > 
 > Then "feature creep" happened, and an extra Mnesia index was created
 > for position #5.  At the 200K records were added slowly, no one
 > noticed the performance impact using the exact same term for position
 > #5 ... until I did, last last night.
 > 
 > Moral of the story for Mnesia users (and other databases, I'm sure):
 > beware of the impact of adding secondary indexes.
 > 
 > For the Mnesia dev team, I have two questions:
 > 
 > 1. That change to mnesia_index.erl is awfully handy ... though
 >    unfortunately only handy when the Mnesia debug level is changed
 >    from the default.
 > 
 > 2. What are the odds that a future release could have less evil
 >    behavior (less than O(N^2), taking a wild guess) with secondary
 >    indexes like my (unfortunate, pilot error!) story?
 > 
 > -Scott