Mnesia and partitioning big tables: Doc, it hurts when...

Scott Lystig Fritchie fritchie@REDACTED
Tue Aug 29 18:59:25 CEST 2006


Howdy.  I don't know if any of you have tried partitioning a big
Mnesia table.  Oh, say, one with 800K entries in it.  "Doctor, it
hurts when I do [this]."  The runtime appears to be O(N^2).(*)  I
interrupted the process after noticing two things:
    * The output of mnesia:info/0 would arrive on my remote shell at a
    rate of approx. 1 *line* every 15-20 seconds.
    * It still hadn't finished when I returned from supper.

This seems to be a symptom of a larger problem: any transaction that
involves a large number of write/delete operations also appears to
behave in an O(N^2) manner.  Judging from mnesia_frag.erl, adding a
fragment uses the same mechanism as other write & delete operations.

The culprit appears to be the temporary ETS table
'mnesia_trans_store', used to store current transaction state.  It's a
'bag' table, and all (?) write & delete operations use the same key,
'op', for storage in that ETS table.  Sequential insertion of items
with the same key into a 'bag' table looks like the cause of the
pain.(***)
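
The suspected cause can be checked outside of Mnesia with a small
timing sketch.  It assumes that an insert into a 'bag' has to compare
the new object against the objects already stored under the same key,
which is how ETS avoids duplicates as far as I understand it.  The
module name, the table name and the shape of the stored tuples are
made up for illustration; this is not the real layout of
'mnesia_trans_store'.

```erlang
-module(bag_insert_demo).
-export([run/1, time_inserts/3, insert_loop/4]).

%% Insert N distinct objects into a fresh ETS table of the given type
%% and return the elapsed time in microseconds.  KeyFun maps the loop
%% counter to the key, so the caller can force one shared key (like
%% 'op') or N distinct keys.
time_inserts(Type, KeyFun, N) ->
    Tab = ets:new(demo_store, [Type, private]),
    %% timer:tc/3 is used because it exists in old and new OTP releases.
    {Micros, ok} = timer:tc(?MODULE, insert_loop, [Tab, KeyFun, 1, N]),
    %% Every object is distinct, so all N of them must have been stored.
    N = ets:info(Tab, size),
    true = ets:delete(Tab),
    Micros.

insert_loop(_Tab, _KeyFun, I, N) when I > N ->
    ok;
insert_loop(Tab, KeyFun, I, N) ->
    %% Hypothetical object shape, chosen only for this demo.
    true = ets:insert(Tab, {KeyFun(I), {some_tab, I}, write}),
    insert_loop(Tab, KeyFun, I + 1, N).

%% Time three cases: a bag with one shared key, a bag with distinct
%% keys, and a duplicate_bag with one shared key.
run(N) ->
    Same = fun(_I) -> op end,
    Distinct = fun(I) -> I end,
    [{bag_same_key, time_inserts(bag, Same, N)},
     {bag_distinct_keys, time_inserts(bag, Distinct, N)},
     {duplicate_bag_same_key, time_inserts(duplicate_bag, Same, N)}].
```

Calling bag_insert_demo:run(N) for a few growing values of N (say
10000, 20000, 40000) and comparing how the three timings scale should
show whether the shared key is the problem.  If the assumption holds,
only the first case should grow much faster than linearly.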

Is there any reason why 'mnesia_trans_store' is a 'bag'?  Aside from
being a convenient (but slow) way to store the write & delete & other
ops?

-Scott

(*) Erlang R10B-9 on a Linux/Opteron platform, HiPE not used for my
app or for Mnesia(**).  The table has not yet been fragmented.
mnesia:change_table_frag(Tab, {activate, []}) is fast.
mnesia:change_table_frag(Tab, {add_frag, [node()]}) is not.
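
A rough way to put numbers on "fast" and "not" is to wrap both calls
in timer:tc/3.  This is only a sketch: it assumes Mnesia is already
running and that Tab names an existing, not yet fragmented table.  The
module and function names are hypothetical.

```erlang
-module(frag_timing).
-export([time_add_frag/1]).

%% Time the two change_table_frag calls from the footnote above.
%% Returns, for each step, the elapsed microseconds and Mnesia's
%% result, which is {atomic, ok} or {aborted, Reason}.
time_add_frag(Tab) ->
    {ActivateUs, ActivateRes} =
        timer:tc(mnesia, change_table_frag, [Tab, {activate, []}]),
    {AddUs, AddRes} =
        timer:tc(mnesia, change_table_frag, [Tab, {add_frag, [node()]}]),
    [{activate, ActivateUs, ActivateRes},
     {add_frag, AddUs, AddRes}].
```

On a large table the second step may run for a very long time, so it
is worth trying this on a smaller copy of the data first.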

(**) Is anyone running a HiPE-compiled Mnesia application?  I haven't
tried, yet.

(***) Below is an excerpt of 'fprof' output from adding a fragment to
an 800K entry table that already had 25 fragments; each fragment had
approx. 25K entries.  Distribution isn't even, so a few fragments have
about 50K entries.
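
For anyone who wants to reproduce this, a profile with the same
analysis options as the excerpt that follows could be collected
roughly like this.  It is a sketch, not necessarily the exact commands
used for the numbers in this post; the module name, function name and
the OutFile argument are hypothetical, and it assumes Mnesia is running
and Tab is already fragmented.

```erlang
-module(frag_profile).
-export([profile_add_frag/2]).

%% Profile one add_frag operation with fprof and write the analysis
%% to OutFile.  Returns whatever mnesia:change_table_frag/2 returned.
profile_add_frag(Tab, OutFile) ->
    %% Run the call under fprof tracing; the raw trace goes to the
    %% default trace file in the current directory.
    Res = fprof:apply(mnesia, change_table_frag,
                      [Tab, {add_frag, [node()]}]),
    %% Turn the raw trace into profiling data.
    ok = fprof:profile(),
    %% Same analysis options as shown in the excerpt.
    ok = fprof:analyse([{dest, OutFile},
                        {callers, true},
                        {sort, acc},
                        {totals, true},
                        {details, true}]),
    Res.
```

Tracing slows the operation down considerably, so a table much
smaller than 800K entries is a more practical starting point.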

%% Analysis results:
{  analysis_options,
 [{callers, true},
  {sort, acc},
  {totals, true},
  {details, true}]}.

%                                               CNT       ACC       OWN        
[{ totals,                                     3686037,367339.496,343509.022}].  %%%

{[{{mnesia_schema,schema_transaction,1},          1,349798.387,    0.000},      
  {{shell,eval_loop,3},                           2,17538.776,    0.000},      
  {{disk_log,monitor_request,2},                  2, 2406.145,    0.000},      
  {{mnesia_schema,do_insert_schema_ops,2},      326, 1139.758,    0.000},      
  {{dets,req,2},                                  2,  849.061,    0.000},      
  {{mnesia_dumper,insert,8},                   16542,  212.873,    0.000},      
  {{mnesia_index,del_ixes,4},                  1110,  105.762,    0.000},      
  {{mnesia_index,add_index2,6},                 913,   96.456,    0.000},      
  {{mnesia_lib,db_get,3},                       609,   82.711,    0.000},      
  {{mnesia_tm,commit_write,6},                  195,   81.309,    0.000},      
  {{ets,lookup_element,3},                      690,   80.703,    0.000},      
  {{ets,lookup,2},                              354,   75.630,    0.000},      
  {{ets,insert,2},                              359,   74.180,    0.000},      
  {{mnesia_tm,prepare_snmp,3},                  522,   72.567,    0.000},      
  {{mnesia_lib,db_get,2},                       815,   64.079,    0.000},      
  {{mnesia_schema,prepare_op,3},                439,   60.331,    0.000},      
  {{mnesia_schema,prepare_ops,6},               434,   59.457,    0.000},      
  {{mnesia_lib,val,1},                          539,   57.751,    0.000},      
  {{mnesia_index,delete_index2,3},              643,   51.354,    0.000},      
  {{mnesia_tm,do_update_op,3},                  361,   50.525,    0.000},      
  {{mnesia_frag_hash,key_to_frag_number,2},     484,   49.866,    0.000},      
  {{mnesia_index,db_put,2},                     255,   44.327,    0.000},      
  {{ets,match_delete,2},                        273,   44.318,    0.000},      
  {{disk_log_server,get_log_pids,1},            212,   44.299,    0.000},      
  {{mnesia_frag,do_split,5},                    575,   38.952,    0.000},      
  {{disk_log,notify,2},                         274,   37.963,    0.000},      
  {{mnesia_tm,val,1},                           339,   37.882,    0.000},      
  {{mnesia_dumper,disc_insert,8},               291,   32.388,    0.000},      
  {{mnesia_frag,key_to_n,2},                    286,   27.702,    0.000},      
  {{erlang,phash,2},                            221,   27.694,    0.000},      
  {{mnesia_tm,do_snmp,2},                        74,   25.488,    0.000},      
  {{erlang,'++',2},                              98,   24.193,    0.000},      
  {{ets,select_delete,2},                       136,   22.095,    0.000},      
  {{mnesia_index,add_index,5},                   84,   21.879,    0.000},      
  {{mnesia_frag,key_pos,1},                     101,   20.113,    0.000},      
  {{mnesia_dumper,insert_ops,6},                251,   19.165,    0.000},      
  {{mnesia_lib,db_put,3},                        97,   19.094,    0.000},      
  {{mnesia_tm,commit_delete,6},                 220,   19.020,    0.000},      
  {{math,pow,2},                                220,   16.810,    0.000},      
  {{mnesia_index,db_match_erase,2},             381,   12.915,    0.000},      
  {{disk_log,alog,2},                           241,   12.784,    0.000},      
  {{erlang,term_to_binary,1},                   101,   12.571,    0.000},      
  {{mnesia_schema,val,1},                       184,   11.679,    0.000},      
  {{gen,wait_resp_mon,3},                        17,    9.381,    0.000},      
  {{mnesia_dumper,insert_op,5},                 156,    6.492,    0.000},      
  {{mnesia_dumper,open_files,4},                124,    6.461,    0.000},      
  {{mnesia_log,append,2},                       156,    6.404,    0.000},      
  {{ets,delete,2},                               42,    6.379,    0.000},      
  {{mnesia_lib,db_erase,3},                      68,    6.350,    0.000},      
  {{mnesia_index,delete_index,3},                50,    6.304,    0.000},      
  {{ets,match_object,2},                         49,    0.049,    0.000},      
  {{mnesia_locker,receive_wlocks,4},              4,    0.005,    0.000},      
  {{prim_file,drv_get_response,1},                4,    0.004,    0.000},      
  {{mnesia_tm,rec,2},                             1,    0.004,    0.000},      
  {{mnesia_schema,schema_coordinator,3},          1,    0.004,    0.000},      
  {{filename,join1,4},                            3,    0.003,    0.000},      
  {{shell,used_records,3},                        1,    0.001,    0.000},      
  {{shell,prep_check,1},                          1,    0.001,    0.000},      
  {{mnesia_schema,rec2list,3},                    1,    0.001,    0.000},      
  {{mnesia_schema,list2cs,1},                     1,    0.001,    0.000},      
  {{mnesia_schema,do_set_schema,2},               1,    0.001,    0.000},      
  {{mnesia_schema,'-change_table_frag/2-fun-0-',2},   1,    0.001,    0.000},
  {{lists,reverse,1},                             1,    0.001,    0.000},      
  {{lists,foldl,3},                               1,    0.001,    0.000},      
  {{ets,match,2},                                 1,    0.001,    0.000},      
  {{erl_eval,'-merge_bindings/2-fun-0-',2},       1,    0.001,    0.000},      
  {{dict,get_slot,2},                             1,    0.001,    0.000},      
  {{fprof,just_call,2},                           1,    0.000,    0.000}],     
 { suspend,                                    30943,373628.863,    0.000},     %
 [ ]}.

....

{[{{mnesia_schema,do_insert_schema_ops,2},     50004,324510.758,324480.804},
  {{mnesia_index,db_put,2},                    75003,  264.027,  226.191},      
  {{mnesia_lib,db_put,3},                      25001,  131.409,  125.020},      
  {{mnesia_lib,set,2},                          109,    9.958,    9.957},      
  {{mnesia_locker,get_wlocks_on_nodes,5},         4,    0.004,    0.004},      
  {{mnesia_schema,insert_cstruct,3},              3,    0.003,    0.003},      
  {{mnesia_locker,wlock,3},                       3,    0.003,    0.003},      
  {{mnesia_tm,multi_commit,4},                    1,    0.001,    0.001},      
  {{mnesia_recover,note_outcome,1},               1,    0.001,    0.001},      
  {{mnesia_recover,note_decision,2},              1,    0.001,    0.001}],     
 { {ets,insert,2},                             150130,324916.165,324841.985},   %
 [{suspend,                                     359,   74.180,    0.000}]}.    



