[erlang-questions] Mnesia question: DECISION_TAB.LOG and ETS 'mnesia_decision' sizes
Dan Gudmundsson
dgud@REDACTED
Fri Feb 9 11:58:48 CET 2007
Can you stop sending these Mnesia questions on Fridays... :-)
You don't use the old load_regulator switch, do you? If so, remove it.
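I believe the switch in question is mnesia's dump_log_load_regulation
parameter (double-check your own start scripts). From a shell on one of
the nodes you can see whether it is set:

  application:get_env(mnesia, dump_log_load_regulation).
  %% undefined (or {ok,false}) means it is off, which is what you want;
  %% {ok,true} means erl is started with
  %% "-mnesia dump_log_load_regulation true", so drop that flag and
  %% restart.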
disc_copies or disc_only_copies?
Performance has actually increased a lot compared to before, when dets
was used to store data on disk (for disc_copies), but log dumping is
still one of the heaviest operations in Mnesia: if you keep writing
data, it will eat a lot of CPU cycles and I/O.
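If the dumper really is the bottleneck you can at least make it dump
less often, at the cost of a bigger LATEST.LOG and a longer recovery
after a crash. The knobs are the dump_log_write_threshold and
dump_log_time_threshold parameters; the values below are only
examples, tune them against your own load:

  erl -mnesia dump_log_write_threshold 50000 \
      -mnesia dump_log_time_threshold 300000 ...

i.e. dump after 50000 log writes or after 5 minutes, whichever comes
first. You can also set them with application:set_env/3 before mnesia
is started on the node.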
I'll have to look into how DECISION_TAB.LOG works... but I don't think
that is really the problem.
What do your transactions do?
If all nodes are involved in your writes, then scaling out to several
nodes doesn't help; it just costs more.
If a transaction does 3 writes, to tables A, B and C, and in your
4 node case A is on N1,N2, B is on N2,N3 and C is on N3,N4, then all
four nodes need to take part in the commit, and performance is lower
in the 4 node case.
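In other words, something like this (table and record contents are
made up, just to illustrate) touches all four nodes on every commit:

  F = fun() ->
          mnesia:write({a, some_key, val_a}),  %% table a: N1,N2
          mnesia:write({b, some_key, val_b}),  %% table b: N2,N3
          mnesia:write({c, some_key, val_c})   %% table c: N3,N4
      end,
  {atomic, ok} = mnesia:transaction(F).
  %% The commit involves N1..N4, so every node pays for every
  %% transaction and adding nodes does not add throughput.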
It's pretty tricky to get more performance out of mnesia_frag; you
will have to locate the fragments so that the layout fits your
application and usage.
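The main knob is the frag_properties list you give when the table is
created; the node names and numbers below are made up, the point is to
keep the tables a transaction touches together on the same nodes:

  mnesia:create_table(my_tab,
      [{attributes, [key, val]},
       {frag_properties,
           [{node_pool, [n1@host, n2@host, n3@host, n4@host]},
            {n_fragments, 16},
            {n_disc_copies, 2}]}]).
  %% and remember to access it through
  %% mnesia:activity(transaction, Fun, [], mnesia_frag).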
But it would be pretty interesting to see an eprof run of mnesia_dumper
to see if we can improve it.
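Something along these lines should do it (check the eprof manual page
for your release, the API has shifted a little between releases):

  eprof:start(),
  %% dumps are driven from mnesia_controller; the dump_and_reply
  %% workers are spawned from it, so they should show up in the
  %% profile as well
  eprof:start_profiling([whereis(mnesia_controller)]),
  timer:sleep(60000),              %% profile a minute under full load
  eprof:stop_profiling(),
  eprof:log("mnesia_dump.eprof"),  %% analysis output goes to this file
  eprof:analyse().                 %% eprof:analyze() on newer releases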
Email me directly.
/Dan
Scott Lystig Fritchie wrote:
> Hi, everyone. I've been perplexed for the last several hours by a
> feature of Mnesia...
>
> ... The application that I'm playing with is write-heavy. Extremely
> write-heavy. Very small number of Mnesia tables. More than 1K
> transactions per second, with records ranging in size from 0.5-20KB
> each. Nothing really fancy for indexing. Using mnesia_frag. Roughly
> 1GB of test data total in all tables. R10B-9. Xeon EM64T boxes.
> Linux CentOS 4.
>
> Short story: the 2 node, 4 node, and 8 node systems all have the
> same throughput.
> (See below for longer story.)
>
> I am very suspicious of the log dumping activity, the size of the ETS
> table 'mnesia_decision', and the size of the DECISION_TAB.LOG file.
>
> 8> length(ets:tab2list(mnesia_decision)).
> 665533
>
> # ls -Fs $prefix/var/data/Mnesia*/D*
> 59436 /usr/local/gemini/hss/var/data/Mnesia.hss1@REDACTED/DECISION_TAB.LOG
>
> That's after I had stopped and restarted all 8 nodes. Why is the
> decision table so big? Naive me, I assume that after all nodes have
> started, any pending & interrupted & uncertain transactions have all
> been figured out, aborted/committed, and life goes on with a clean
> slate, right?
>
> 'mnesia_decision' and DECISION_TAB.LOG have similar but not identical
> sizes on all 8 nodes.
>
> I hacked a Perl script to tell me the age of each DECISION_TAB.LOG log
> file on each node. Under load, 90% of the time it's less than 4
> seconds old.
>
> Furthermore, "etop" shows that mnesia_controller:dump_and_reply/2 is a
> *huge* time hog. I'll attach some "etop" output at the end ... I made
> it wider than usual, just so I could **see** the long names.
>
> Er, um ... {whimper} ... help? If stopping & starting 1/some/all
> nodes doesn't clear the mnesia_decision table, what will?
>
> -Scott
>
> --- snip --- snip --- snip --- snip --- snip --- snip ---
>
> I'd been playing with a 2 node system, where each table has 2
> replicas/copies. I've got a baseline of performance with that system,
> so now it's time to go to a 4-node system, using mnesia_frag, and each
> fragment has 2 replicas/copies. I aim my load generators at all 4
> nodes, fire, ...
>
> ... and the throughput is roughly the same as the 2 node system. Both
> the 2 node and 4 node systems have each node eating 100% CPU time.
> But throughput is flat. No increase, just flat.
>
> So I try an 8 node system. Same result: 8 nodes eating 100% CPU time,
> throughput is roughly the same as a 2 node system. I stopped my head
> banging, but my forehead still has QWERTY embossed in it.
>
> --- snip --- snip --- snip --- snip --- snip --- snip ---
>
> I have only experimented a little bit with a 2 node system with
> R11B-3 nodes ... they showed no substantial improvement in application
> throughput, though both scheduler threads consumed a lot of CPU time.
> I need to do more experimenting, though....
>
> --- snip --- snip --- snip --- snip --- snip --- snip ---
>
> ========================================================================================
> 'hss1@REDACTED' 04:50:39
> Load: cpu 49 Memory: total 580593 binary 122157
> procs 175 processes 181189 code 5345
> runq 24 atom 527 ets 238741
> Pid Name or Initial Func Time Reds Memory MsgQ Current Function
> ----------------------------------------------------------------------------------------
> <4139.8505.1> mnesia_controller:dump_and_reply/2 1831902 1269755 180548880 0 disk_log:monitor_request/2
> <4139.54.0> mnesia_locker 101187 82850 55968 42 mnesia_locker:can_lock/4
> <4139.56.0> mnesia_tm 92369 87708 145448 95 disk_log:monitor_request/2
> <4139.104.0> proc_lib:init_p/5 59379 54031 68280 16 disk_log:loop/1
> <4139.55.0> mnesia_recover 51388 79740 68656 38 gen_server:handle_common_reply/6
> <4139.614.0> pssql_protocol:new_session/4 36499 19841 16760 0 mnesia_tm:rec/2
> <4139.2009.0> pssql_protocol:new_session/4 24692 16669 17704 0 mnesia_tm:rec/2
> <4139.5833.0> pssql_protocol:new_session/4 23328 17266 17680 0 mnesia_tm:rec/2
> <4139.374.0> cmcc_msgid_db_stats 23153 82959 513584 0 gen:wait_resp_mon/3
> <4139.1923.0> pssql_protocol:new_session/4 20902 15785 16720 0 prim_inet:recv0/3
> ========================================================================================
>
> ========================================================================================
> 'hss1@REDACTED' 04:50:44
> Load: cpu 34 Memory: total 518185 binary 121968
> procs 187 processes 119142 code 5345
> runq 0 atom 527 ets 238570
> Pid Name or Initial Func Time Reds Memory MsgQ Current Function
> ----------------------------------------------------------------------------------------
> <4139.8869.1> mnesia_controller:dump_and_reply/2 1562904 711958 117133680 0 disk_log:monitor_request/2
> <4139.8935.1> proc_lib:init_p/5 300661 1235 7744 0 prim_file:drv_get_response/1
> <4139.56.0> mnesia_tm 99142 123879 88736 0 mnesia_tm:doit_loop/1
> <4139.54.0> mnesia_locker 69454 96376 42152 0 mnesia_locker:loop/1
> <4139.55.0> mnesia_recover 61112 83136 42192 0 gen_server:loop/6
> <4139.104.0> proc_lib:init_p/5 43921 92687 67824 0 disk_log:loop/1
> <4139.374.0> cmcc_msgid_db_stats 36493 84721 513584 0 io:wait_io_mon_reply/2
> <4139.2009.0> pssql_protocol:new_session/4 23298 20384 13704 0 prim_inet:recv0/3
> <4139.859.0> pssql_protocol:new_session/4 19199 22988 11840 0 prim_inet:recv0/3
> <4139.30267.0> pssql_protocol:new_session/4 19038 18456 16760 0 mnesia_locker:receive_wlocks/4
> ========================================================================================
>
> ========================================================================================
> 'hss1@REDACTED' 04:50:49
> Load: cpu 41 Memory: total 599778 binary 122033
> procs 177 processes 200608 code 5345
> runq 0 atom 527 ets 238631
> Pid Name or Initial Func Time Reds Memory MsgQ Current Function
> ----------------------------------------------------------------------------------------
> <4139.9456.1> mnesia_controller:dump_and_reply/2 1678523 1111769 200609760 0 disk_log:monitor_request/2
> <4139.9573.1> proc_lib:init_p/5 323167 1235 7744 0 prim_file:drv_get_response/1
> <4139.56.0> mnesia_tm 120268 135773 68384 0 mnesia_tm:doit_loop/1
> <4139.54.0> mnesia_locker 79628 129004 34256 0 mnesia_locker:loop/1
> <4139.104.0> proc_lib:init_p/5 67524 54152 55048 0 disk_log:loop/1
> <4139.55.0> mnesia_recover 64354 117890 34296 0 gen_server:loop/6
> <4139.859.0> pssql_protocol:new_session/4 49069 50448 16792 0 gen:wait_resp_mon/3
> <4139.676.0> pssql_protocol:new_session/4 43298 39918 11880 0 mnesia_tm:rec_all/4
> <4139.2009.0> pssql_protocol:new_session/4 40278 39381 17736 0 gen:wait_resp_mon/3
> <4139.606.0> pssql_protocol:new_session/4 30449 31238 16760 0 mnesia_tm:rec_all/4
> ========================================================================================