Mnesia Speed Optimization

Younès HAFRI yhafri@REDACTED
Tue Jan 20 09:31:57 CET 2004


Hi All,
First, thank you for your answers.

I want to share the results of my last 3 days of optimization and
benchmarking to speed up my database.
I've decided to move my server completely from a Mnesia table to a pure ETS
table, and I benchmarked all the "ets" functions for reading a tuple.

The result was incredible ;-) .

COMPARISON FOR 1 MILLION INSERTIONS
(of course, these results depend hugely on the size of the inserted tuple)
My box config: PC 966 MHz, 512 MB RAM, SuSE 8.2

1. Using ets:match() with LIMIT = 1:
mnesia:dirty_read() 23.2 seconds ------------> ets:match() 8.61 seconds.

2. Using ets:match_object() with LIMIT = 1:
mnesia:dirty_read() 23.2 seconds ------------> ets:match_object() 8.30 seconds.

3. Using ets:member():
mnesia:dirty_read() 23.2 seconds ------------> ets:member() 4.21 seconds.

4. Using ets:lookup():
mnesia:dirty_read() 23.2 seconds ------------> ets:lookup() 4.09 seconds.

This is the best function:

add_free(UrlList) ->
    %% Insert {Name, 0} only for names that are not in the table yet.
    lists:foreach(fun (Name) ->
                          case ets:lookup(url, Name) of
                              [] -> ets:insert(url, {Name, 0});
                              _  -> ok
                          end
                  end, UrlList).
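
One further possibility, offered only as a sketch: ets:insert_new/2 inserts
an object only when no object with the same key is already present, so the
lookup and the insert collapse into a single call. The module name url_store
and the functions new/0 and timed_add_free/1 are hypothetical; the table name
url and the {Name, 0} tuple shape come from add_free/1 above. No timing is
claimed for this variant.

```erlang
-module(url_store).
-export([new/0, add_free/1, timed_add_free/1]).

%% Create the named table used by add_free/1 (a plain in-memory set).
new() ->
    ets:new(url, [set, public, named_table]).

%% Insert {Name, 0} for every name that is not yet in the table.
%% ets:insert_new/2 returns false and leaves the table untouched
%% when the key already exists, so no separate lookup is needed.
add_free(UrlList) ->
    lists:foreach(fun (Name) ->
                          ets:insert_new(url, {Name, 0})
                  end, UrlList).

%% Time one run with timer:tc/1; the result is in microseconds.
timed_add_free(UrlList) ->
    {Micros, ok} = timer:tc(fun () -> add_free(UrlList) end),
    Micros.
```

After url_store:new(), calling url_store:add_free(["a", "b", "a"]) leaves two
entries in the table, because the second "a" is rejected by ets:insert_new/2.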

After that, I set some options to higher values:
 {min_heap_size, ?HEAP_SIZE}, % where ?HEAP_SIZE equals 2 * MAXIMUM TUPLES (2 million)
 {priority, high},
 {fullsweep_after, 40}

This gives me a response in under 4 seconds (3.81 seconds on average).
Setting ?HEAP_SIZE to more than 2 * MAXIMUM TUPLES does not change
anything (for me, of course).
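
These three settings are valid process options, so one way to apply them is
through spawn_opt/2. The following is a minimal sketch, assuming the work runs
in a dedicated process. The module url_worker, the function start/1 and the
reply message are hypothetical, and ?HEAP_SIZE is assumed here to be 2000000
(min_heap_size is measured in words).

```erlang
-module(url_worker).
-export([start/1]).

%% Assumed value: 2 * 1,000,000 tuples, given in words.
-define(HEAP_SIZE, 2000000).

%% Spawn a worker that runs Fun with the tuned options and
%% sends the result back to the calling process.
start(Fun) ->
    Parent = self(),
    spawn_opt(fun () -> Parent ! {self(), Fun()} end,
              [{min_heap_size, ?HEAP_SIZE},
               {priority, high},
               {fullsweep_after, 40}]).
```

The caller can then wait for {Pid, Result}, where Pid is the value returned by
url_worker:start/1.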

A few changes in the architecture gave me a speedup of about 80%. ;-)

Any other suggestions?

Regards,
Younès



Ulf Wiger wrote:

>
>
> Hello Younès,
>
> Your chances of receiving help will increase tremendously if
> you subscribe to the erlang-questions@REDACTED list. (:
>
> In this case, the linear behaviour is what can be expected at best.
> The thing you can do is to try to reduce the cost of each iteration.
> The cheapest way to check whether a record exists is to use
> ets:member/2. This will not work if the table is on disk only,
> or if there is no local copy. mnesia:dirty_read/1 always works.
>
> Example:
>
> (test@REDACTED)3> mnesia:create_table(table_index,[{type,set},{ram_copies,[node()]},{attributes,[key,value]}]).
>
> {atomic,ok}
> (test@REDACTED)4> [mnesia:dirty_write({table_index,K,value}) || K <- lists:seq(1,10)].
> [ok,ok,ok,ok,ok,ok,ok,ok,ok,ok]
> (test@REDACTED)5> ets:member(table_index,5).
> true
> (test@REDACTED)6> ets:member(table_index,50).
> false
>
>
> You shouldn't under any circumstances use ets:insert/2 against a mnesia
> table, and never mix dirty_write/1 with transactions. Whether it's safe
> to use ets read operations and dirty_read/1 on a mnesia table depends
> on your application, but that's the optimization possibility I can
> see in this case.
>
> Your function then becomes:
>
> insert_if_not_exists(ListOfRecords) ->
>    lists:foreach(
>       fun(Record) ->
>          case ets:member(table_index, Record) of
>             true -> ok;
>             false ->
>                mnesia:dirty_write(...)
>          end
>       end, ListOfRecords).
>
> Regards,
> Ulf Wiger
>
>
> On Fri, 16 Jan 2004 13:55:03 +0100, Younès HAFRI <yhafri@REDACTED> wrote:
>
>> Hi,
>> I have a big optimization problem with Mnesia.
>> My database is not replicated and it works on the local node.
>> I want to write a record only if it does not already exist in the database.
>>
>> For example:
>> insert_if_not_exists(aaaa) -> ok
>> insert_if_not_exists(bbbb) -> ok
>> insert_if_not_exists(cccc) -> ok
>> insert_if_not_exists(aaaa) -> no insertion (already exists)
>> insert_if_not_exists(dddd) -> ok
>> ....
>>
>> I have a huge list of records, 1 million exactly. And this is my
>> "insert if not exists" function:
>> ----------------------------------------------------------------------------
>> insert_if_not_exists(ListOfRecords) ->
>>    lists:foreach(fun (Record) ->
>>                          case mnesia:dirty_read({table_index, Record}) of
>>                              []        -> mnesia:dirty_write(#table_index{field1 = Record});
>>                              [_Exists] -> ok
>>                          end
>>                  end, ListOfRecords).
>> ---------------------------------------------------------------------------
>>
>>
>> Here are some results:
>> insertion of 1000      records takes 0.02   seconds on average
>> insertion of 10000     records takes 0.19   seconds on average
>> insertion of 100000    records takes 2.27   seconds on average
>> insertion of 1 million records takes 23.2   seconds on average
>>
>> The insertion time is linear, but too high for 1 million records.
>>
>> Could you help me speed up this function, please?
>> Is there a method (parameters to set) to tune Mnesia?
>>
>> I really need help!
>> Thank you in advance
>>
>> Best Regards
>> Younès
>
>
>
>




