mnesia usage optimisation

Luke Gorrie luke@REDACTED
Wed Jul 24 16:42:17 CEST 2002


Hi all,

I'm trying to optimise some code that uses Mnesia, and I'd really
appreciate some advice :-)

It's for a program that uses Mnesia to store the edges of a directed
graph, the basic record being {From, To}. The table gets large, and
gets a lot of writes per transaction (e.g. a 50,000 element table
might be (re)built in one transaction). There are also a lot of edges
with the same 'From' value (each value is a small list or tuple).

I'm using Mnesia 3.10.2 on R7.

The following operations are needed, and they have to be fast:

  insert an element
  delete all elements with a particular From field
  lookup all elements with a particular To field

This module runs as part of a transaction, and it would be more
convenient to make it run fast in Mnesia than to use some other data
structure.

In the first approach, I use -record(myrec, {from, to}) for my
schema (names changed to avoid distraction). The table is a ram-only
bag, and I have an extra index on 'to'.
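
For concreteness, here is a minimal sketch of that table definition. It assumes Mnesia is already started on the local node; the module name edge_table is made up for illustration.

```erlang
-module(edge_table).
-export([create/0]).

-record(myrec, {from, to}).

%% Create a RAM-only bag table keyed on 'from' (the first record field),
%% with a secondary index on 'to'.
%% Assumption: mnesia:start() has already been called on this node.
create() ->
    mnesia:create_table(myrec,
                        [{type, bag},
                         {ram_copies, [node()]},
                         {attributes, record_info(fields, myrec)},
                         {index, [to]}]).
```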

The implementation is simple enough. Insert is a mnesia:write(),
delete by 'from' is a mnesia:delete(), and lookup by 'to' is a
mnesia:index_read().
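
Roughly, the three operations look like the sketch below. The module and function names (edge_ops, insert/2, delete_from/1, lookup_to/1) are illustrative only, and each function is assumed to be called from inside mnesia:transaction/1.

```erlang
-module(edge_ops).
-export([insert/2, delete_from/1, lookup_to/1]).

-record(myrec, {from, to}).

%% Assumption: every function here runs inside mnesia:transaction/1
%% against a bag table 'myrec' keyed on 'from' and indexed on 'to'.

%% Insert one edge.
insert(From, To) ->
    mnesia:write(#myrec{from = From, to = To}).

%% Delete every edge whose key ('from') matches.
delete_from(From) ->
    mnesia:delete({myrec, From}).

%% Look up every edge pointing at To, via the secondary index.
lookup_to(To) ->
    mnesia:index_read(myrec, To, #myrec.to).
```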

The trouble is that the writes get slow when I have a lot of records
with the same 'from' value (the table key). It looks like it takes
linear time, and is way too slow (~10ms for a single write with ~5000
other records with the same key). The other operations are fine.
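
A rough way to reproduce the measurement is sketched below. It is not my exact benchmark: the module name edge_bench and the key same_key are made up, and it assumes the 'myrec' bag table already exists.

```erlang
-module(edge_bench).
-export([time_same_key_writes/1]).

-record(myrec, {from, to}).

%% Write N edges that all share one 'from' key inside a single
%% transaction, and return the elapsed time in microseconds.
%% Assumption: the 'myrec' bag table has already been created.
time_same_key_writes(N) ->
    Write = fun() ->
                lists:foreach(
                  fun(I) ->
                      mnesia:write(#myrec{from = same_key, to = I})
                  end,
                  lists:seq(1, N))
            end,
    {Micros, {atomic, ok}} = timer:tc(mnesia, transaction, [Write]),
    Micros.
```

Comparing the result for, say, N = 500 and N = 5000 should show whether the cost per write grows with the number of records under the same key.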

So I need to make it faster. Anyone have a good idea?

(The story continues..)

My next approach was to swap the order of the record fields, so that
'to' is the table key. This is because there tend not to be so many
duplicate 'to' values, so I can live with linear time on duplicates
(for now at least..)

With this approach, insert is a mnesia:write(), lookup by 'to' field
is a mnesia:read(), and delete by 'from' _would be_ a
mnesia:index_delete() - but that function doesn't exist. So for delete
I use:

  lists:foreach(fun(Rec) -> mnesia:delete_object(Rec) end,
                mnesia:index_read(myrec, From, #myrec.from))

.. which is very much too slow for deletions, although the insert does
become fast.
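
Put together, the second approach amounts to something like the following sketch. The module name edge_ops2 and the function names are illustrative, and insert/2, lookup_to/1 and delete_from/1 are assumed to run inside mnesia:transaction/1.

```erlang
-module(edge_ops2).
-export([create/0, insert/2, lookup_to/1, delete_from/1]).

%% Fields swapped relative to the first approach: 'to' is now the key.
-record(myrec, {to, from}).

%% RAM-only bag keyed on 'to', with a secondary index on 'from'.
%% Assumption: mnesia is already started on this node.
create() ->
    mnesia:create_table(myrec,
                        [{type, bag},
                         {ram_copies, [node()]},
                         {attributes, record_info(fields, myrec)},
                         {index, [from]}]).

insert(From, To) ->
    mnesia:write(#myrec{to = To, from = From}).

%% Lookup by 'to' is now a plain key read.
lookup_to(To) ->
    mnesia:read({myrec, To}).

%% No index_delete exists, so read via the index and delete one by one.
delete_from(From) ->
    lists:foreach(fun(Rec) -> mnesia:delete_object(Rec) end,
                  mnesia:index_read(myrec, From, #myrec.from)).
```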

And that's where I'm up to now. Anyone with a solution will be awarded
a beer at EUC :-)

Cheers,
Luke



