[erlang-questions] High lock contention on dist_tables
Brian Picciano
mediocregopher@REDACTED
Mon Apr 22 22:23:48 CEST 2013
We are actually. Is there an alternative way of easily retrieving which
nodes are currently connected?
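
One option that might work, sketched here purely as an illustration: keep a
cached copy of the connected-node list and update it from the nodeup/nodedown
messages that net_kernel:monitor_nodes/1 delivers, so the hot path reads an
ETS table instead of calling nodes(). The module name node_cache, the table
name and the connected/0 function are hypothetical, and whether this reduces
the dist_table contention depends on nodes() really being the source of it.

```erlang
-module(node_cache).
-behaviour(gen_server).

%% Hypothetical helper: caches the visible connected nodes in ETS.
-export([start_link/0, connected/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(TABLE, node_cache_tab).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Read the cached list from ETS; nodes() is not called here.
connected() ->
    [Node || {Node} <- ets:tab2list(?TABLE)].

init([]) ->
    ets:new(?TABLE, [named_table, protected, set, {read_concurrency, true}]),
    %% Subscribe before seeding, so a node that connects in between
    %% is picked up either by nodes() or by a later nodeup message.
    ok = net_kernel:monitor_nodes(true),
    ets:insert(?TABLE, [{N} || N <- nodes()]),
    {ok, nostate}.

handle_info({nodeup, Node}, State) ->
    ets:insert(?TABLE, {Node}),
    {noreply, State};
handle_info({nodedown, Node}, State) ->
    ets:delete(?TABLE, Node),
    {noreply, State};
handle_info(_Other, State) ->
    {noreply, State}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

With this running, callers would use node_cache:connected() wherever they
call nodes() today, and nodes() itself would only run once at startup.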
On Mon, Apr 22, 2013 at 12:06 PM, Lukas Larsson <lukas@REDACTED> wrote:
> The dist_table mutex refers to the rwmutex which is defined here[1]. There
> is a bunch of different places where it is used, so saying exactly what is
> causing the contentions is hard without knowing the code. Generally it
> should indicate that you are trying to send many messages over distribution
> while information about remote nodes is changing frequently.
>
> One thing I noticed is that the nodes() BIF call takes a rwlock on the
> mutex. Are you using that BIF a lot?
>
> Lukas
>
> [1]:
> https://github.com/erlang/otp/blob/maint/erts/emulator/beam/erl_node_tables.c#L802
>
>
> On Fri, Apr 19, 2013 at 9:24 PM, Brian Picciano <mediocregopher@REDACTED> wrote:
>
>> We have a pool of 3 Erlang nodes, all on different servers. Every
>> afternoon, without fail, lots of messages between the nodes start
>> showing really high latency, on the order of tens of seconds. Today we
>> ran lcnt on them to see if there's anything there, and found that on one of
>> the nodes dist_table had a significantly higher collision percentage than
>> anything else, and definitely higher than on the other boxes:
>>
>> (node@REDACTED)8> lcnt:conflicts().
>>
>>          lock   id    #tries  #collisions  collisions [%]  time [us]  duration [%]
>>         -----  ---   -------  -----------  --------------  ---------  ------------
>>    dist_table    1   3468191      1242055         35.8128  153712413      255.2521
>>     run_queue   24  76969638      4088578          5.3119   14468656       24.0264
>> process_table    1   2015686       147148          7.3001    3208529        5.3280
>>   timer_wheel    1  12214948       834737          6.8337    3076638        5.1090
>>     timeofday    1  18231600       594487          3.2608    1491633        2.4770
>> ...
>>
>> while on the other boxes it had closer to 3. On the box with the high
>> lock contention we also saw much higher load than on the other boxes.
>>
>> My question is: what is this lock? We couldn't find much online except
>> that it appears to have to do with communication between nodes, but we're
>> not sure what. Also, what, if anything, could we do to mitigate this
>> problem?
>>
>> (We're running Erlang R16B)
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>