[erlang-questions] Mnesia deadlock with large volume of dirty operations?
Dan Gudmundsson
dgud@REDACTED
Fri Apr 2 09:19:10 CEST 2010
When you are using dirty operations, every operation is sent separately to
all nodes, i.e. 192593*6 messages. A transaction could actually have been
faster in this case, since it sends one (large) message containing all the
ops to each node.
What you get instead is an overloaded mnesia_tm (very long message queues),
which does the actual writing of the data on the other (participating)
mnesia nodes. So transactions will be blocked waiting for mnesia_tm to
process those ~200000 messages on the other nodes.
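For example, something along these lines (an untested sketch, reusing the
offline_msg match spec from the code further down) would do the cleanup
inside one transaction, so each remote node gets a single commit message
instead of one message per deleted record:

DeleteInTx = fun(Days) ->
                 {MegaSecs, Secs, _MicroSecs} = now(),
                 T = MegaSecs * 1000000 + Secs - 86400 * Days,
                 TimeStamp = {T div 1000000, T rem 1000000, 0},
                 F = fun() ->
                         %% select and delete in the same transaction;
                         %% the ops are shipped to the other nodes at commit
                         Old = mnesia:select(offline_msg,
                                             [{'$1',
                                               [{'<', {element, 3, '$1'},
                                                 {TimeStamp}}],
                                               ['$1']}]),
                         lists:foreach(fun(Rec) ->
                                           mnesia:delete_object(Rec)
                                       end, Old),
                         length(Old)
                     end,
                 mnesia:transaction(F)
             end.

Note that this still takes a write lock per object; for a delete of this
size you may want to take a table write lock first
(mnesia:write_lock_table/1), or split the job into a few reasonably sized
transactions.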
/Dan
On Fri, Apr 2, 2010 at 1:11 AM, Brian Acton <acton@REDACTED> wrote:
> Hi guys,
>
> I am running R13B04 SMP on FreeBSD 7.3. I have a cluster of 7 nodes running
> mnesia.
>
> I have a table of 1196143 records using about 1.504GB of storage. It's a
> reasonably hot table doing a fair number of insert operations at any given
> time.
>
> I decided that, since there is a 2GB limit in mnesia, I should do some
> cleanup on the system, and specifically on this table.
>
> Trying to avoid major problems with Mnesia, transaction load, and deadlock,
> I decided to do dirty_select and dirty_delete_object individually on the
> records.
>
> I started slowly, deleting first 10, then 100, then 1,000, then 10,000, then
> 100,000 records. My goal was to delete 192593 records total.
>
> The first five deletions went through nicely and caused minimal to no
> impact.
>
> Unfortunately, the very last delete blew up the system. My delete command
> completed successfully, but on the other nodes it caused mnesia to get stuck
> on pending transactions and message queues to fill up, and it basically
> brought down the whole system. We saw some "Mnesia is overloaded" messages
> in our logs on those nodes, but not a ton of them.
>
> Does anyone have any clues on what went wrong? I am attaching my code below
> for your review.
>
> --b
>
> Mnesia configuration tunables:
>
> -mnesia no_table_loaders 20
> -mnesia dc_dump_limit 40
> -mnesia dump_log_write_threshold 10000
>
> Example error message:
>
> ** WARNING ** Mnesia is overloaded: {mnesia_tm, message_queue_len,
> [387,842]}
>
> Sample code:
>
> Select = fun(Days) ->
>             %% cutoff timestamp: Days days before now, in now() format
>             {MegaSecs, Secs, _MicroSecs} = now(),
>             T = MegaSecs * 1000000 + Secs - 86400 * Days,
>             TimeStamp = {T div 1000000, T rem 1000000, 0},
>             %% return all offline_msg records whose 3rd element
>             %% (the timestamp) is older than the cutoff
>             mnesia:dirty_select(offline_msg,
>                                 [{'$1',
>                                   [{'<', {element, 3, '$1'},
>                                     {TimeStamp}}],
>                                   ['$1']}])
>          end.
>
> Count = fun(Days) -> length(Select(Days)) end.
>
> Delete = fun(Days, Total) ->
>             %% delete at most Total of the matching records,
>             %% one dirty_delete_object per record
>             C = Select(Days),
>             D = lists:sublist(C, Total),
>             lists:foreach(fun(Rec) ->
>                               ok = mnesia:dirty_delete_object(Rec)
>                           end, D),
>             length(D)
>          end.
>