multi-attribute mnesia indexes?

Shawn Pearce spearce@REDACTED
Fri Dec 29 06:17:18 CET 2000


We're working on an application that probably should be using Oracle.
However, the dataset is small enough that we should be able to use
mnesia (100,000 rows in a table).  What we have run into is that we
want to have 16 or so processes scanning the mnesia table, while another
two are performing write transactions against it.

The first problem is that Mnesia reports that it is overloaded.  The exact
console message is:

	=ERROR REPORT==== 28-Dec-2000::23:55:46 ===
	Mnesia('spearce@REDACTED'): ** ERROR ** Mnesia is overloaded: {dump_log, time_threshold}

I dug in the archives and added these to my command line:

        -mnesia dump_log_load_regulation false \
        -mnesia dump_log_write_threshold 100000 \

This cut back on the number of Mnesia error reports to one every few
minutes, but they are still occurring.
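
Possibly relevant: the report names time_threshold, and mnesia also has
a dump_log_time_threshold parameter (in milliseconds).  Raising it
alongside the write threshold might quiet things further; the 300000
(five minutes) below is just a guess:

        -mnesia dump_log_time_threshold 300000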

What the application is doing is this: two generator processes are
writing records into two mnesia tables, some 100,000 records in total.
Both processes are running in a tight loop, roughly like the one below:

mk(0) -> done;
mk(X) ->
	A = #foo{...},
	B = #bar{...},
	mnesia:transaction(fun() ->
		mnesia:write(A),
		mnesia:write(B)
	end),
	mk(X - 1).

I started them by hand from the shell with:

	spawn(mymod, mk, [50000]).
	spawn(mymod, mk, [50000]).

A rough calculation shows that mnesia is only doing 43 of these
transactions per second under the current system load.
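
For what it's worth, batching several records into each transaction
should amortize the commit cost.  A minimal sketch; the batch size of
100 is an arbitrary guess, and the record construction is elided as
above:

	mk_batched(0) -> done;
	mk_batched(X) when X >= 100 ->
		write_batch(100),
		mk_batched(X - 100);
	mk_batched(X) ->
		write_batch(X),
		done.

	%% One transaction per batch of records, instead of one
	%% transaction per record pair.
	write_batch(N) ->
		mnesia:transaction(fun() ->
			lists:foreach(fun(_) ->
				mnesia:write(#foo{...}),
				mnesia:write(#bar{...})
			end, lists:seq(1, N))
		end).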

Now to add to the confusion, 16 other processes are running
dirty_match_object operations against the tables at the same time the
two generators are writing to them.  One of the 16 processes reads only
one indexed column, so it uses dirty_index_read.  The other 15 are busy
with many calls to dirty_match_object.  The pattern used is the wild
pattern for the table (9 attributes), with 5 of the attributes bound to
a value and the other 4 left as wild cards.  None of the bound
attributes is the primary key (the first attribute).
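
Concretely, each match call looks roughly like this, with invented
attribute names (a through h) and Va..Ve bound to the values being
searched for:

	%% Start from the table's wild pattern (every attribute '_'),
	%% then bind 5 of the attributes; the key stays a wild card.
	Wild = mnesia:table_info(foo, wild_pattern),
	Pat = Wild#foo{a = Va, b = Vb, c = Vc, d = Vd, e = Ve},
	Recs = mnesia:dirty_match_object(Pat).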

Erlang uses 99% of the CPU to run this job.  Right now it is up at 70 MB
of RAM, as the tables are all disc_copies tables (so they are held in
RAM).  Would switching to disc_only_copies tables help performance,
getting the data out of RAM sooner?  My machine has 256 MB of RAM free,
so swapping is not occurring at the OS level.
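
If disc_only_copies turns out to be worth trying, the storage type can
apparently be changed in place; a sketch, with foo standing in for the
real table name:

	%% Move the table from RAM-plus-disk to disk-only storage.
	mnesia:change_table_copy_type(foo, node(), disc_only_copies).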

So.....

1) What can I do differently to prevent mnesia from whining about its
log files?

2) Is there anything I can do to increase the performance of my match
operation?  Would switching to Mnemosyne help in this situation?  Does
mnesia support multi-attribute indexes that would speed up the
match_object operation?
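
For reference, the only indexes I can find in mnesia are
single-attribute ones, declared and queried like this (foo and b are
placeholders):

	%% Add a secondary index on one attribute, then read through it.
	mnesia:add_table_index(foo, b),
	Recs = mnesia:dirty_index_read(foo, Vb, b).

A multi-column match would then have to filter the resulting list in
Erlang.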


At present, my only other option is to switch to a real SQL database,
as I can get true multi-column indexes there.



--
Shawn.

  ``If this had been a real
    life, you would have
    received instructions
    on where to go and what
    to do.''


