[erlang-questions] Mnesia deadlock with large volume of dirty operations?

Ovidiu Deac <>
Fri Apr 2 14:47:04 CEST 2010


To me it sounds like another example of premature optimization which
went wrong? :)

On Fri, Apr 2, 2010 at 10:19 AM, Dan Gudmundsson <> wrote:
> When you use dirty operations, every operation is sent separately to all
> nodes, i.e. 192593*6 messages. A transaction could actually have been faster
> in this case, with one (large) message containing all ops sent to each node.
>
> What you get is an overloaded mnesia_tm (very long msg queues), which
> does the actual writing of the data on the other participating mnesia
> nodes.
>
> So transactions will be blocked waiting on mnesia_tm to process those 200000
> messages on the other nodes.
>
> /Dan
>
> On Fri, Apr 2, 2010 at 1:11 AM, Brian Acton <> wrote:
>> Hi guys,
>>
>> I am running R13B04 SMP on FreeBSD 7.3. I have a cluster of 7 nodes running
>> mnesia.
>>
>> I have a table of 1196143 records using about 1.504GB of storage. It's a
>> reasonably hot table doing a fair number of insert operations at any given
>> time.
>>
>> Since there is a 2GB limit in mnesia, I decided that I should do some
>> cleanup on the system and specifically on this table.
>>
>> Trying to avoid major problems with Mnesia, transaction load, and deadlock,
>> I decided to do dirty_select and dirty_delete_object individually on the
>> records.
>>
>> I started slow, deleting first 10, then 100, then 1000, then 10000, then
>> 100,000 records. My goal was to delete 192593 records total.
>>
>> The first five deletions went through nicely and caused minimal to no
>> impact.
>>
>> Unfortunately, the very last delete blew up the system. My delete command
>> completed successfully, but on the other nodes it caused mnesia to get stuck
>> on pending transactions, filled up my message queues, and basically
>> brought down the whole system. We saw some "Mnesia is overloaded" messages
>> in our logs on these nodes, but not a ton of them.
>>
>> Does anyone have any clues on what went wrong? I am attaching my code below
>> for your review.
>>
>> --b
>>
>> Mnesia configuration tunables:
>>
>>      -mnesia no_table_loaders 20
>>      -mnesia dc_dump_limit 40
>>      -mnesia dump_log_write_threshold 10000
>>
>> Example error message:
>>
>> ** WARNING ** Mnesia is overloaded: {mnesia_tm, message_queue_len,
>> [387,842]}
>>
>> Sample code:
>>
>> Select = fun(Days) ->
>>         {MegaSecs, Secs, _MicroSecs} = now(),
>>         T = MegaSecs * 1000000 + Secs - 86400 * Days,
>>         TimeStamp = {T div 1000000, T rem 1000000, 0},
>>         mnesia:dirty_select(offline_msg,
>>                     [{'$1',
>>                       [{'<', {element, 3, '$1'}, {TimeStamp}}],
>>                       ['$1']}])
>>     end.
>>
>> Count = fun(Days) -> length(Select(Days)) end.
>>
>> Delete = fun(Days, Total) ->
>>         C = Select(Days),
>>         D = lists:sublist(C, Total),
>>         lists:foreach(fun(Rec) ->
>>                       ok = mnesia:dirty_delete_object(Rec)
>>                   end,
>>                   D),
>>         length(D)
>>     end.
>>
>

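To make Dan's suggestion above concrete, here is a minimal sketch of the same cleanup done in batched transactions instead of one dirty operation per record. The module name batch_delete, the function delete_in_batches/2 and the batch size are hypothetical and not from the thread; only mnesia:transaction/1 and mnesia:delete_object/1 are real Mnesia calls. The idea is that each batch is committed to the replicas as a unit, so the remote mnesia_tm processes are not handed one message per deleted record.

```erlang
-module(batch_delete).
-export([delete_in_batches/2]).

%% Delete Records (full record tuples, e.g. the result of Select(Days))
%% in chunks of at most BatchSize, using one transaction per chunk.
%% BatchSize is an assumed tuning knob; the thread does not suggest a value.
delete_in_batches([], _BatchSize) ->
    ok;
delete_in_batches(Records, BatchSize)
  when is_integer(BatchSize), BatchSize > 0 ->
    {Batch, Rest} = take(BatchSize, Records),
    %% lists:foreach/2 returns ok, so a committed batch yields {atomic, ok}.
    {atomic, ok} =
        mnesia:transaction(
          fun() ->
                  lists:foreach(fun(Rec) -> mnesia:delete_object(Rec) end,
                                Batch)
          end),
    delete_in_batches(Rest, BatchSize).

%% Split off at most N elements from the front of the list.
take(N, List) -> take(N, List, []).

take(0, Rest, Acc)    -> {lists:reverse(Acc), Rest};
take(_, [], Acc)      -> {lists:reverse(Acc), []};
take(N, [H | T], Acc) -> take(N - 1, T, [H | Acc]).
```

With the funs from the original post this could be called as batch_delete:delete_in_batches(Select(Days), 1000). Whether this is actually faster or gentler on a 7-node cluster would need measuring; mnesia:sync_transaction/1, which waits for the other nodes to commit before returning, may be worth trying in place of mnesia:transaction/1 if the remote nodes still fall behind.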