multi-attribute mnesia indexes?

Fri Dec 29 15:37:10 CET 2000

Hello,

This mail does not contain an answer to your problems.  However, since I am planning to do research on the topic you
describe, I would appreciate it if you (or somebody else) would make your (or similar) source code publicly available.

I am very keen on looking at it (and I will of course inform you of performance bottlenecks and optimizations).

I am also interested in software projects in which large amounts of C code are used in combination with Erlang code.

As a third remark, I would like to mention that the installation of Erlang R7B-1 on RedHat Linux 7.0 was successful.
(My apologies if this is not the right place to mention this).

Greetings,
Karel.

Shawn Pearce wrote:

> We're working on an application that probably should be using Oracle.
> However, the dataset is small enough that we should be able to use
> mnesia (100,000 rows in a table).  What we have run into is that we
> want to have 16 or so processes scanning the mnesia table, while another
> two are performing write transactions against it.
>
> First problem is that Mnesia is reporting it's overloaded.  The exact
> console message is:
>
>         =ERROR REPORT==== 28-Dec-2000::23:55:46 ===
>         Mnesia('spearce@REDACTED'): ** ERROR ** Mnesia is overloaded: {dump_log, time_threshold}
>
> I dug in the archives and added these to my command line:
>
>         -mnesia dump_log_load_regulation false \
>         -mnesia dump_log_write_threshold 100000 \
>
> This cut back on the number of Mnesia error reports to one every few
> minutes, but they are still occurring.
>
> What the application is doing is this: two generator processes are writing
> records into two mnesia tables, some 100,000 records at once.  Both
> processes are running in a tight loop, kind of like what you see below:
>
> mk(0) -> done;
> mk(X) ->
>         A = #foo{...},
>         B = #bar{...},
>         mnesia:transaction(fun() ->
>                 mnesia:write(A),
>                 mnesia:write(B)
>         end),
>         mk(X - 1).
>
> I started them by hand from the shell with:
>
>         spawn(mymod, mk, [50000]).
>         spawn(mymod, mk, [50000]).
>
> Rough calculation shows that mnesia is only doing 43 of these
> transactions per second with the system load such that it is.
>
> Now to add to the confusion, 16 other processes are running
> dirty_match_object operations against the tables at the same time the
> two generators are writing to them.  One of the 16 processes reads only
> one column in an index, so we use dirty_index_read.  The other 15 are
> busy with calls (many calls) to dirty_match_object.  The pattern used
> is the wild pattern for the table (9 attributes), with 5 of the
> attributes filled in with a value.  The other 4 were left alone.  (To
> be wild cards.)  None of these was the primary key (first attribute).
>
> Erlang uses 99% of the CPU to run this job.  Right now, it's up at 70 MB
> of RAM, as the tables are all disk_copies tables (so they are cached
> in RAM).  Would switching to disk_only tables help performance, getting
> rid of the cruft from RAM faster?  My machine has 256 MB of RAM free,
> so swapping is not occurring at the OS level.
>
> So.....
>
> 1) What can I do differently to prevent mnesia from whining about its
> log files?
>
> 2) Is there anything I can do to increase the performance of my match
> operation?  Would switching to mnemosyne help in this situation?  Does
> mnesia support multi-attribute indexes which would speed up the
> performance of the match_object operation?
>
> At present, my only other option is to switch to a real SQL database,
> as I can get true multi-column indexes there.
>
> --
> Shawn.
>
>   ``If this had been a real
>     life, you would have
>     received instructions
>     on where to go and what
>     to do.''
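
On the multi-attribute index question quoted above: one commonly used
workaround can be sketched as follows. It is only a minimal sketch under
stated assumptions, not a measured fix for the workload described. The idea
is to store the attributes you always match on together as one extra tuple
attribute and put an ordinary secondary index on that attribute. The module
name, the `item` record, its fields, and the `ab` attribute are all
hypothetical; the table is RAM-only so that the example runs without a disk
schema.

```erlang
-module(composite_index_sketch).
-export([demo/0]).

%% Hypothetical record. `ab' is an extra attribute holding the tuple {A, B},
%% so that a single ordinary secondary index covers both attributes.
-record(item, {id, a, b, c, ab}).

%% Build a record, always deriving the composite attribute from a and b.
new_item(Id, A, B, C) ->
    #item{id = Id, a = A, b = B, c = C, ab = {A, B}}.

demo() ->
    %% Assumption: a fresh node with no existing `item' table.
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(item,
        [{ram_copies, [node()]},
         {attributes, record_info(fields, item)},
         {index, [ab]}]),
    {atomic, ok} = mnesia:transaction(fun() ->
        mnesia:write(new_item(1, x, 10, red)),
        mnesia:write(new_item(2, x, 20, blue)),
        mnesia:write(new_item(3, y, 10, red)),
        ok
    end),
    %% Equality lookup on both a and b through the index on `ab'.
    Hits = mnesia:dirty_index_read(item, {x, 10}, #item.ab),
    [R#item.id || R <- Hits].
```

The index only helps for equality on the whole tuple, so any remaining
attributes still have to be filtered from the returned records, and every
writer has to keep `ab` consistent with `a` and `b`.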