Mnesia and additional indexes: a cautionary tale
Scott Lystig Fritchie
fritchie@REDACTED
Thu Mar 30 02:00:02 CEST 2006
Greetings. I have a cautionary tale to tell about Mnesia and adding
an extra attribute index.
The story starts with panic (mine!) late last night. I was doing some
rough performance tests for a Mnesia-based application: simple
1-attribute changes to single records in several tables. Updates for
one specific table were 2.5 *orders of magnitude* slower than all
others.
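
Roughly, each measurement timed a single transaction that reads one
record, changes one attribute, and writes it back. The sketch below is
a reconstruction with made-up names, not the real test code:

    %% Time one single-record, single-attribute update, in microseconds.
    %% AttrPos is the tuple position of the attribute (the key is at 2).
    time_one_update(Tab, Key, AttrPos) ->
        F = fun() ->
                    [Rec] = mnesia:read(Tab, Key, write),
                    mnesia:write(Tab, setelement(AttrPos, Rec, new_value), write)
            end,
        {Micros, {atomic, ok}} = timer:tc(mnesia, transaction, [F]),
        Micros.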
All of the tables were disc_copies tables. All contained 200K
entries. All fit quite comfortably in RAM without pissing off the
virtual memory system.
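
The tables were created more or less like this; the attribute names
here are placeholders, since the real definitions don't matter for the
story:

    mnesia:create_table('Tab1',
                        [{disc_copies, [node()]},
                         {attributes, [key, attr_a, attr_b, attr_c]}]).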
It was late, and I didn't want to struggle with remembering how to use
"fprof" or "eprof", so I used "cprof". IIRC, "cprof" can profile all
Erlang processes without lots of brain power or keystrokes. (It was
late, I was tired.) Cprof showed that close to 2 orders of magnitude
fewer VM reductions were being executed. Huh. That was not what
I wanted to see.
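
The cprof session was something along these lines, where
run_update_workload/0 is a stand-in for my test driver:

    cprof:start(),            %% start call counting, all modules
    run_update_workload(),    %% hypothetical: drive the single-record updates
    cprof:pause(),            %% freeze the counters
    cprof:analyse(),          %% per-module / per-function call counts
    cprof:stop().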
Go to sleep, wake up refreshed, then tackle the problem again.
Additional profiling is frustrated: no Erlang functions claim the
extra time. Perhaps I'm just inept at "fprof" subtlety, somehow
omitting the Erlang process that was eating all the CPU time? {shrug}
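
For what it's worth, what I tried was roughly the standard fprof
recipe, applied to one test call (reusing the hypothetical
time_one_update/3 from the sketch above):

    fprof:apply(fun() -> time_one_update('Tab3', some_key, 5) end, []),
    fprof:profile(),                          %% reads the default fprof.trace
    fprof:analyse([{dest, "fprof.analysis"}]).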
Later in the afternoon, I shut down Mnesia, then restart it. My
application starts timing out at mnesia:wait_for_tables/2. So I start
mnesia manually, then go get coffee and make a phone call. When I
return 15 minutes later, Mnesia *still* hasn't finished starting up.
The "beam" process size should've been about 1,400KB with everything
loaded. But the process size was only 390MB, and "beam" was still
using 100% CPU time ... doing something, I dunno what!
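
The startup step in question is nothing exotic, just the usual
pattern, with the table list and timeout shortened here:

    ok = mnesia:start(),
    Tabs = mnesia:system_info(tables) -- [schema],
    case mnesia:wait_for_tables(Tabs, 60000) of
        ok                   -> ok;
        {timeout, NotLoaded} -> {error, {still_loading, NotLoaded}}
    end.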
So, I kill the VM and restart. Before starting Mnesia, I use
mnesia:set_debug_level(verbose). Sure enough, I see messages like:
Mnesia(pss1@REDACTED): Intend to load tables: [{'Tab1',local_only},
{'Tab2',local_only},
{'Tab3',local_only},
{'Tab4',local_only},
...
]
Mnesia(pss1@REDACTED): Mnesia started, 0 seconds
Mnesia(pss1@REDACTED): Creating index for 'Tab1'
Mnesia(pss1@REDACTED): Creating index for 'Tab2'
Mnesia(pss1@REDACTED): Creating index for 'Tab3'
Mnesia(pss1@REDACTED): Creating index for 'Tab3'
... and it hangs there, eating 100% CPU and getting no further.
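
For anyone playing along at home, seeing those messages only takes a
call made before Mnesia is started:

    mnesia:set_debug_level(verbose),
    mnesia:start().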
A quick edit to mnesia_index.erl to include the attribute position
number shows me this instead:
Mnesia(pss1@REDACTED): Creating index for 'Tab1' Pos 3
Mnesia(pss1@REDACTED): Creating index for 'Tab2' Pos 7
Mnesia(pss1@REDACTED): Creating index for 'Tab3' Pos 3
Mnesia(pss1@REDACTED): Creating index for 'Tab3' Pos 5
Ah! Suddenly, it becomes very, very clear.
The table 'Tab3' contains 200K debugging/development records. When
the code to create those records was first written, the attribute at
position #5 was a constant binary term.
Then "feature creep" happened, and an extra Mnesia index was created
for position #5. As the 200K records were added slowly, no one
noticed the performance impact of using the exact same term for
position #5 ... until I did, late last night.
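
If you'd like to reproduce the trap in miniature, something like the
module below will do it; the module, table, and attribute names are
all invented. It builds a disc_copies table whose secondary index is
on an attribute that never varies, then loads it up. With a couple
hundred thousand rows, watch how long table loading takes the next
time Mnesia starts and has to rebuild that index.

    -module(index_trap).
    -export([setup/0, load/1]).

    -record(debug_rec, {key, info, tag = <<"same for every record">>}).

    setup() ->
        %% assumes mnesia:create_schema([node()]) was run once and
        %% mnesia:start() has been called
        {atomic, ok} =
            mnesia:create_table(debug_rec,
                                [{disc_copies, [node()]},
                                 {attributes, record_info(fields, debug_rec)},
                                 %% secondary index on an attribute whose
                                 %% value is identical in every record
                                 {index, [#debug_rec.tag]}]),
        ok.

    load(N) ->    %% e.g. load(200000)
        lists:foreach(fun(K) ->
                              mnesia:dirty_write(#debug_rec{key = K, info = K})
                      end,
                      lists:seq(1, N)).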
Moral of the story for Mnesia users (and other databases, I'm sure):
beware of the performance impact of adding secondary indexes,
especially on attributes with very few distinct values.
For the Mnesia dev team, I have two questions:
1. That change to mnesia_index.erl is awfully handy ... though
   unfortunately it is only handy when the Mnesia debug level is
   changed from the default. Any chance of including the attribute
   position in the standard startup messages?
2. What are the odds that a future release could have less evil
   behavior (less than O(N^2), taking a wild guess) with secondary
   indexes in a pathological case like mine (unfortunate pilot
   error!)?
-Scott