[erlang-questions] High lock contention on dist_tables
Brian Picciano
mediocregopher@REDACTED
Fri Apr 19 21:24:12 CEST 2013
We have a pool of 3 Erlang nodes, all on different servers. Every
afternoon, without fail, lots of messages between the nodes start
having really high latency, on the order of tens of seconds. Today we
ran lcnt on them to see if there's anything there, and found that on one of
the nodes dist_table had a significantly higher collision percentage than
any other lock, and definitely higher than on the other boxes:
(node@REDACTED)8> lcnt:conflicts().
          lock   id    #tries  #collisions  collisions [%]  time [us]  duration [%]
         -----  ---  --------  -----------  --------------  ---------  ------------
    dist_table    1   3468191      1242055         35.8128  153712413      255.2521
     run_queue   24  76969638      4088578          5.3119   14468656       24.0264
 process_table    1   2015686       147148          7.3001    3208529        5.3280
   timer_wheel    1  12214948       834737          6.8337    3076638        5.1090
     timeofday    1  18231600       594487          3.2608    1491633        2.4770
...
while on the other boxes it was closer to 3. On the box with the high lock
contention we also saw much higher load than on the other boxes.
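
For anyone wanting to reproduce a measurement like the one above, here is a
minimal sketch of an lcnt session. It assumes the emulator was built with
lock counting enabled; the 10 second sampling window and the limit of 10
locks are arbitrary choices for illustration, not what we actually used.

```erlang
%% Run in the Erlang shell on the node being investigated.
%% Assumption: the emulator was built with lock counting enabled.
lcnt:start().                % start the lcnt server
lcnt:clear().                % reset counters so older data does not skew the sample
timer:sleep(10000).          % let the node run under its normal load (10 s, arbitrary)
lcnt:collect().              % copy the lock statistics from the runtime into lcnt
lcnt:conflicts([{sort, time}, {max_locks, 10}]).  % print the most contended locks by time
lcnt:inspect(dist_table).    % print more detail for the dist_table lock
```

Sorting by time rather than by collision count puts the locks that threads
actually spent the longest waiting on at the top.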
My question is: what is this lock? We couldn't find much online except that
it appears to have to do with communication between nodes, but we're not
sure what. Also, what, if anything, could we do to mitigate this problem?
(We're running Erlang R16B.)