mnesia: inserting a large number of records

Hakan Mattsson <>
Mon Aug 1 15:50:48 CEST 2005

try mnesia:write_lock_table/1 first. It is simple
to use and reduces the locking overhead substantially.

I am attaching a simple test program, where the difference
turned out to be almost a factor 50 with 10000 records a
~4000 bytes. But you should really measure on your own
hardware with real data. 

Vance, did you try table locks? How much did you gain
with your solution?


On Thu, 28 Jul 2005, Sebastian Bello wrote:

SB> Date: Thu, 28 Jul 2005 17:51:29 -0300
SB> From: Sebastian Bello <>
SB> To: 
SB> Subject: Re: mnesia: inserting a large number of records
SB> Vance,
SB> thank you very much for your response and your detailed description. The
SB> solution looks a bit complicated to me, but I'll try some tests.
SB> Thanks!
SB>  Sebastian-
SB> ----- Original Message -----
SB> From: "Vance Shipley" <>
SB> To: "Sebastian Bello" <>
SB> Cc: <>
SB> Sent: Wednesday, July 27, 2005 5:24 AM
SB> Subject: Re: mnesia: inserting a large number of records
SB> > On Fri, Jul 22, 2005 at 10:29:36AM -0300, Sebastian Bello wrote:
SB> > }
SB> > } a programm reads records from a text file and inserts them in
SB> > } a mnesia table. We are performing this insertions within a
SB> > } transaction so in case of an error the whole file can be
SB> > } reprocessed. The file holds approx. 5.000-10.000 records.
SB> > } It seems the transaction time is not linear; I'm wondering if
SB> > } there is a faster way to perform the insertions, maybe using
SB> > } a table lock, I don't know. Any suggestions?
SB> >
SB> > Sebastian,
SB> >
SB> > I had a similiar challenge where we wanted to import large text
SB> > files into a distributed mnesia database while it was in production.
SB> > In our case we mostly needed to replace the existing copy so I
SB> > came up with the following scheme:
SB> >
SB> >    - create a new ram based table (e.g. foo_import)
SB> >    - use a write lock transaction fun with mnesia:ets/1 to
SB> >      insert records
SB> >    - use mnesia:change_table_copy_type/3 to change it to a
SB> >      disc based table on the local node only
SB> >    - activate a check point on this table table
SB> >    - backup this checkpoint using a custom mnesia_backup
SB> >      behaviour callback module to change the records on
SB> >      the fly to use the real table name (e.g. #foo_import{}
SB> >      to #foo{}).
SB> >
SB> > The idea is that you create the table in an ets context without
SB> > lock overheads so that it is a fast operation (i.e. the user doesn't
SB> > wait long) and then write it out to a binary backup file on disk.
SB> >
SB> > Now the user may use mnesia:restore/2 to replace the working
SB> > table with the backup.  You can do this while the system is running
SB> > and transactions will block while it replaces the table.  In our
SB> > experience a couple seconds at worst.  As I said we just replace
SB> > the current table but you could just as easily insert the records
SB> > into the existing table using the keep_tables option.  I haven't
SB> > tried this scheme so I can't say how it performs.  For our purposes
SB> > we changed the time it took to perform the import from many minutes,
SB> > if not hours, to maybe twenty seconds.  Aftet that as I said the
SB> > table can be replaced in a couple seconds.
SB> >
SB> > -Vance
-------------- next part --------------


-record(t, {key, val}).

go() ->
    Tab = t,
    Storage = disc_copies,
    Nodes = [node() | nodes()],
    N = 10000,
    Bulk = term_to_binary(lists:seq(1, 1000)),
    init(Tab, Storage, Nodes),
    io:format("Import ~p records a ~p bytes into ~p ~p\n",
	      [N, size(Bulk), length(Nodes), Storage]),
    TabTime = run(Tab, N, Bulk, table),
    RecTime = run(Tab, N, Bulk, record),
    io:format("Table lock: ~p seconds\n", [TabTime / 1000000]),
    io:format("Record lock: ~p seconds\n", [RecTime / 1000000]),
    io:format("~p times improvement\n", [RecTime / TabTime]).
run(Tab, N, Bulk, LockType) ->
    Fun = fun() ->
		  case LockType of
		      table ->
		      record ->
		  import(N, Bulk),
    {atomic,ok} = mnesia:clear_table(Tab),
    {Time, {atomic,ok}} = timer:tc(mnesia, sync_transaction, [Fun]),
init(Tab, Storage, Nodes) ->
    rpc:multicall(Nodes, mnesia, stop, []),
    ok = mnesia:create_schema(Nodes),
    rpc:multicall(Nodes, mnesia, start, []),
    TabDef = [{Storage, Nodes}, {record_name, Tab}],
    {atomic,ok} = mnesia:create_table(Tab, TabDef).

import(0, _Bulk) ->
import(N, Bulk) ->
    mnesia:write(t, #t{key = N, val = Bulk}, write),
    import(N - 1, Bulk).


More information about the erlang-questions mailing list