[erlang-questions] Mnesia and schema locks
Dan Gudmundsson
dangud@REDACTED
Tue Feb 20 15:16:00 CET 2018
On Tue, Feb 20, 2018 at 2:53 PM Loïc Hoguin <essen@REDACTED> wrote:
> Thanks, that helped a lot.
>
> What we ended up doing was call mnesia:set_debug_level(debug) and
> subscribe to system events and schema table events using
> mnesia:subscribe/1 and this gave us both the transaction/lock that keeps
> getting restarted and the transaction/lock that is the cause for this
> restart. We then inspected things in Observer and could get a very clear
> view of what is going on.
>
>
Great
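
For anyone finding this in the archive, the setup described above could
look roughly like the sketch below. It is only an illustration: the
module name mnesia_lock_debug and both function names are made up for
this example, and it assumes Mnesia is already running on the node where
it is called. The Mnesia calls themselves (mnesia:set_debug_level/1 and
mnesia:subscribe/1) are the ones mentioned in the quoted text.

```erlang
-module(mnesia_lock_debug).
-export([subscribe/0, print_events/1]).

%% Hypothetical helper: raise the debug level and subscribe the calling
%% process to system events and to events on the schema table.
subscribe() ->
    %% set_debug_level/1 returns the previous level; it is ignored here.
    _Previous = mnesia:set_debug_level(debug),
    {ok, _} = mnesia:subscribe(system),
    {ok, _} = mnesia:subscribe({table, schema}),
    ok.

%% Print every Mnesia event found in the caller's mailbox and return ok
%% once no new event has arrived for TimeoutMs milliseconds.
print_events(TimeoutMs) ->
    receive
        {mnesia_system_event, Event} ->
            io:format("system event: ~p~n", [Event]),
            print_events(TimeoutMs);
        {mnesia_table_event, Event} ->
            io:format("schema event: ~p~n", [Event]),
            print_events(TimeoutMs)
    after TimeoutMs ->
        ok
    end.
```

From a shell on the affected node one would call
mnesia_lock_debug:subscribe() followed by
mnesia_lock_debug:print_events(5000), and then look up the pids that
show up in the output in Observer.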
> By the way, is there a search function for finding a process in Observer?
> That would be useful to find the ones we are looking for. :-)
>
>
Not yet, but it sounds useful. You can sort the columns to ease the scrolling,
but no, I have not received a PR for that yet :-)
> Cheers,
>
> On 02/14/2018 07:32 PM, Dan Gudmundsson wrote:
> > Well, you will need to figure out what <6502.2299.18> and <6502.2302.18>
> > are doing, but they are probably waiting
> > for other locks which are held by the busy processes you wrote about.
> > You will have to look into that yourself; debugging mnesia is just
> > following the breadcrumbs around the system.
> >
> > mnesia_locker:get_held_locks() and mnesia_locker:get_lock_queue() may
> > also help.
> >
> > Using observer to attach to the different nodes is probably easiest;
> > then you can get a stacktrace of each process.
> > Normally when I do it I don't have a live system. If I want to debug
> > post mortem I use mnesia_lib:dist_coredump()
> > to collect each mnesia node's state and analyse it. Though with many
> > nodes it will take some time to debug or
> > figure out why it appears to be hanging.
> >
> >
> > On Wed, Feb 14, 2018 at 6:39 PM Loïc Hoguin <essen@REDACTED> wrote:
> >
> > Hello,
> >
> > We are trying to debug an issue where we observe a lot of contention
> > when a RabbitMQ node goes down. It has a number of symptoms and we are in
> > the middle of figuring things out.
> >
> > One particular symptom occurs on the node that restarts: it gets stuck
> > and there are two Mnesia locks:
> >
> > [{{schema,rabbit_durable_route},read,{tid,879886,<6502.2299.18>}},
> > {{schema,rabbit_exchange},read,{tid,879887,<6502.2302.18>}}]
> >
> > The locks are only cleared when the other node in the cluster stops
> > being so busy deleting data from a number of tables (another symptom)
> > and things go back to normal.
> >
> > Part of the problem is that while this is going on, the restarting node
> > cannot be used, so I would like to understand what conditions can result
> > in these locks staying up for so long. Any tips appreciated!
> >
> > Thanks in advance,
> >
> > --
> > Loïc Hoguin
> > https://ninenines.eu
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions
> >
>
> --
> Loïc Hoguin
> https://ninenines.eu
>
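
As a follow-up to the mnesia_locker hint in the quoted message, here is
a minimal sketch of how one might take a snapshot of the lock state on a
node and keep only the schema locks. The module name lock_peek and its
functions are hypothetical. mnesia_locker is an internal, undocumented
module, so the shape of its return values is an assumption: held locks
are taken to be {Oid, Op, Tid} tuples as in the output pasted at the
start of the thread, and the lock queue is left untouched because its
entry format is not shown anywhere in this thread.

```erlang
-module(lock_peek).
-export([snapshot/0, schema_locks/1]).

%% Collect the held locks and the queue of waiting lock requests on the
%% local node. Mnesia must be running for these calls to return data.
snapshot() ->
    #{held => mnesia_locker:get_held_locks(),
      queued => mnesia_locker:get_lock_queue()}.

%% Keep only the entries whose lock target is {schema, Table}, assuming
%% entries look like {{schema, Table}, Op, Tid}. Anything else is dropped.
schema_locks(Locks) ->
    [Lock || {{schema, _Table}, _Op, _Tid} = Lock <- Locks].
```

In a shell that would be something like
#{held := Held} = lock_peek:snapshot(), lock_peek:schema_locks(Held).
The pids inside the returned tids are the processes to look at next.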