[erlang-questions] mnesia lock, dead-lock or what?

Igor Goryachev igor@REDACTED
Fri Nov 14 13:26:24 CET 2008


Hello, everybody!

We run a heavily loaded cluster of several ejabberd nodes (with some
proprietary modules which do not use mnesia). Half of the nodes are
placed in one data center, the rest in another one. Sometimes (maybe
during network failures) the cluster suffers from locks inside the
mnesia subsystem, which degrades the service as a whole. I have already
checked the code inside the mnesia transactions and it looks fine to
me. We use R12B-3 (amd64, Debian GNU/Linux, package rebuilt without
SMP and async threads) on all our machines. Here is the output of
mnesia:info/0 during the incident ("myserver" and "otherserver\d+"
stand in for the real host names, renamed for security purposes):

(ejabberd@REDACTED)2> mnesia:info().
---> Processes holding locks <---
Lock: {{s2s,{"myserver","otherserver1"}},
       write, 
       {tid,370791,<4079.5909.10>}}
Lock: {{s2s,{"myserver","otherserver2"}},
       write, 
       {tid,370784,<4077.8264.8>}}
Lock: {{s2s,{"myserver","otherserver3"}},
       read,  
       {tid,370787,<3910.21053.1>}}
Lock: {{s2s,{"myserver","otherserver3"}},
       write, 
       {tid,370787,<3910.21053.1>}}
Lock: {{s2s,{"myserver","otherserver4"}},write,{tid,370791,<4077.4651.0>}}
Lock: {{s2s,{"myserver","otherserver5"}},
       write, 
       {tid,370780,<4080.959.9>}}
Lock: {{s2s,{"myserver","otherserver6"}},
       write, 
       {tid,370790,<4080.27328.7>}}
Lock: {{s2s,{"myserver","otherserver7"}},
       write, 
       {tid,370790,<4078.29634.7>}}
Lock: {{s2s,{"myserver","otherserver8"}},
       write, 
       {tid,370790,<4077.5830.9>}}
Lock: {{s2s,{"myserver","otherserver9"}},
       write, 
       {tid,370790,<4079.25514.8>}}
---> Processes waiting for locks <---
---> Participant transactions <---
Tid: 370780 (owned by <4080.959.9>)
with participant objects {commit,ejabberd@REDACTED,presume_commit,
                             [{{s2s,{"myserver","otherserver5"}},
                               {s2s,
                                   {"myserver","otherserver5"},
                                   <4080.959.9>,"551643360"},
                               write}],
                             [],[],[],[]}
Tid: 370784 (owned by <4077.8264.8>)
with participant objects {commit,ejabberd@REDACTED,presume_commit,
                             [{{s2s,{"myserver","otherserver2"}},
                               {s2s,
                                   {"myserver","otherserver2"},
                                   <4077.8264.8>,"2852318864"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4080.27328.7>)
with participant objects {commit,ejabberd@REDACTED,presume_commit,
                             [{{s2s,{"myserver","otherserver6"}},
                               {s2s,
                                   {"myserver","otherserver6"},
                                   <4080.27328.7>,"860186058"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4078.29634.7>)
with participant objects {commit,ejabberd@REDACTED,presume_commit,
                             [{{s2s,{"myserver","otherserver7"}},
                               {s2s,
                                   {"myserver","otherserver7"},
                                   <4078.29634.7>,"1948941028"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4079.25514.8>)
with participant objects {commit,ejabberd@REDACTED,presume_commit,
                             [{{s2s,{"myserver","otherserver9"}},
                               {s2s,
                                   {"myserver","otherserver9"},
                                   <4079.25514.8>,"325351070"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4077.5830.9>)
with participant objects {commit,ejabberd@REDACTED,presume_commit,
                             [{{s2s,{"myserver","otherserver8"}},
                               {s2s,
                                   {"myserver","otherserver8"},
                                   <4077.5830.9>,"1699221599"},
                               delete_object}],
                             [],[],[],[]}
---> Coordinator transactions <---
Tid: 370787 (owned by <3910.21053.1>)
Tid: 370788 (owned by <3910.32551.1>)
.......

What does this mean? Why does it occur? How can we resolve this
behaviour? Is this enough information to investigate?

Thank you very much for your attention.


-- 
    Igor Goryachev              E-Mail/Jabber: igor@REDACTED


