[erlang-questions] Mnesia question: DECISION_TAB.LOG and ETS 'mnesia_decision' sizes

Fri Feb 9 06:55:54 CET 2007

Hi, everyone.  I've been perplexed by the last several hours by a
feature of Mnesia...

... The application that I'm playing with is write-heavy.  Extremely
write-heavy.  Very small number of Mnesia tables.  More than 1K
transactions per second, with records ranging in size from 0.5-20KB
each.  Nothing really fancy for indexing.  Using mnesia_frag.  Roughly
1GB of test data total in all tables.  R10B-9.  Xeon EM64T boxes.
Linux CentOS 4.

Short story: 2 node, 4 node, and 8 node systems have same throughput
             as 2 node system.
(See below for longer story.)

I am very suspicious of the log dumping activity, the size of the ETS
table 'mnesia_decision', and the size of the DECISION_TAB.LOG file.

    8> length(ets:tab2list(mnesia_decision)).
    665533

# ls -Fs $prefix/var/data/Mnesia*/D*
59436 /usr/local/gemini/hss/var/data/Mnesia.hss1@REDACTED/DECISION_TAB.LOG

That's after I had stopped and restarted all 8 nodes.  Why is the
decision table so big?  Naive me, I assume that after all nodes have
started, any pending & interrupted & uncertain transactions have all
been figured out, aborted/committed, and life goes on with a clean
slate, right?

'mnesia_decision' and DECISION_TAB.LOG have similar but not identical
sizes on all 8 nodes.

I hacked a Perl script to tell me the age of each DECISION_TAB.LOG log
file on each node.  Under load, 90% of the time it's less than 4
seconds old.

Furthermore, "etop" shows that mnesia_controller:dump_and_reply/2 is a
*huge* time hog.  I'll attach some "etop" output at the end ... I made
it wider than its usual, just so I could **see** the long names.

Er, um ... {whimper} ... help?  If stopping & starting 1/some/all
nodes doesn't clear the mnesia_decision table, what will?

-Scott

--- snip --- snip --- snip --- snip --- snip --- snip ---

I'd been playing with a 2 node system, where each table has 2
replicas/copies.  I've got a baseline of performance with that system,
so now it's time to go to a 4-node system, using mnesia_frag, and each
fragment has 2 replicas/copies.  I aim my load generators at all 4
nodes, fire, ...

... and the throughput is roughly the same as the 2 node system.  Both
the 2 node and 4 node systems have each node eating 100% CPU time.
But throughput is flat.  No increase, just flat.

So I try an 8 node system.  Same result: 8 nodes eating 100% CPU time,
throughput is roughly the same as a 2 node system.  I stopped my head
banging, but my forehead still has QWERTY embossed in it.

--- snip --- snip --- snip --- snip --- snip --- snip ---

I have only experimented a little bit with a 2 node system with
R11B-3 nodes ... they showed no substantial improvement on application
throughput, though both scheduler threads consume a lot of CPU time.
I need to do more experimenting, though....

--- snip --- snip --- snip --- snip --- snip --- snip ---

========================================================================================
 'hss1@REDACTED'                                                             04:50:39
 Load:  cpu        49               Memory:  total      580593    binary     122157
        procs     175                        processes  181189    code         5345
        runq       24                        atom          527    ets        238741
Pid            Name or Initial Func    Time    Reds  Memory    MsgQ Current Function
----------------------------------------------------------------------------------------
<4139.8505.1>  mnesia_controller:dump_and_reply/2                   1831902   1269755 180548880         0 disk_log:monitor_request/2        
<4139.54.0>    mnesia_locker                                         101187     82850     55968        42 mnesia_locker:can_lock/4          
<4139.56.0>    mnesia_tm                                              92369     87708    145448        95 disk_log:monitor_request/2        
<4139.104.0>   proc_lib:init_p/5                                      59379     54031     68280        16 disk_log:loop/1                   
<4139.55.0>    mnesia_recover                                         51388     79740     68656        38 gen_server:handle_common_reply/6  
<4139.614.0>   pssql_protocol:new_session/4                           36499     19841     16760         0 mnesia_tm:rec/2                   
<4139.2009.0>  pssql_protocol:new_session/4                           24692     16669     17704         0 mnesia_tm:rec/2                   
<4139.5833.0>  pssql_protocol:new_session/4                           23328     17266     17680         0 mnesia_tm:rec/2                   
<4139.374.0>   cmcc_msgid_db_stats                                    23153     82959    513584         0 gen:wait_resp_mon/3               
<4139.1923.0>  pssql_protocol:new_session/4                           20902     15785     16720         0 prim_inet:recv0/3                 
========================================================================================

========================================================================================
 'hss1@REDACTED'                                                             04:50:44
 Load:  cpu        34               Memory:  total      518185    binary     121968
        procs     187                        processes  119142    code         5345
        runq        0                        atom          527    ets        238570
Pid            Name or Initial Func    Time    Reds  Memory    MsgQ Current Function
----------------------------------------------------------------------------------------
<4139.8869.1>  mnesia_controller:dump_and_reply/2                   1562904    711958 117133680         0 disk_log:monitor_request/2        
<4139.8935.1>  proc_lib:init_p/5                                     300661      1235      7744         0 prim_file:drv_get_response/1      
<4139.56.0>    mnesia_tm                                              99142    123879     88736         0 mnesia_tm:doit_loop/1             
<4139.54.0>    mnesia_locker                                          69454     96376     42152         0 mnesia_locker:loop/1              
<4139.55.0>    mnesia_recover                                         61112     83136     42192         0 gen_server:loop/6                 
<4139.104.0>   proc_lib:init_p/5                                      43921     92687     67824         0 disk_log:loop/1                   
<4139.374.0>   cmcc_msgid_db_stats                                    36493     84721    513584         0 io:wait_io_mon_reply/2            
<4139.2009.0>  pssql_protocol:new_session/4                           23298     20384     13704         0 prim_inet:recv0/3                 
<4139.859.0>   pssql_protocol:new_session/4                           19199     22988     11840         0 prim_inet:recv0/3                 
<4139.30267.0> pssql_protocol:new_session/4                           19038     18456     16760         0 mnesia_locker:receive_wlocks/4    
========================================================================================

========================================================================================
 'hss1@REDACTED'                                                             04:50:49
 Load:  cpu        41               Memory:  total      599778    binary     122033
        procs     177                        processes  200608    code         5345
        runq        0                        atom          527    ets        238631
Pid            Name or Initial Func    Time    Reds  Memory    MsgQ Current Function
----------------------------------------------------------------------------------------
<4139.9456.1>  mnesia_controller:dump_and_reply/2                   1678523   1111769 200609760         0 disk_log:monitor_request/2        
<4139.9573.1>  proc_lib:init_p/5                                     323167      1235      7744         0 prim_file:drv_get_response/1      
<4139.56.0>    mnesia_tm                                             120268    135773     68384         0 mnesia_tm:doit_loop/1             
<4139.54.0>    mnesia_locker                                          79628    129004     34256         0 mnesia_locker:loop/1              
<4139.104.0>   proc_lib:init_p/5                                      67524     54152     55048         0 disk_log:loop/1                   
<4139.55.0>    mnesia_recover                                         64354    117890     34296         0 gen_server:loop/6                 
<4139.859.0>   pssql_protocol:new_session/4                           49069     50448     16792         0 gen:wait_resp_mon/3               
<4139.676.0>   pssql_protocol:new_session/4                           43298     39918     11880         0 mnesia_tm:rec_all/4               
<4139.2009.0>  pssql_protocol:new_session/4                           40278     39381     17736         0 gen:wait_resp_mon/3               
<4139.606.0>   pssql_protocol:new_session/4                           30449     31238     16760         0 mnesia_tm:rec_all/4               
========================================================================================