[erlang-questions] Performance analysis advice
ANTHONY MOLINARO
anthonym@REDACTED
Tue Feb 18 05:25:09 CET 2014
Hi Dmitry,
Thanks, I'll try this out. I tried fprof and percept2 but unfortunately didn't have much luck sifting through the output. I'll try eep out, and I saw eflame mentioned in another message which I'll also check out. Unfortunately, I've mostly had to discover these things in a haphazard way, as this email has been lost among the massive threads taking up everyone's email energy ;).
For the most part, I've tried to design the application to remove contention. At the moment message passing is minimal, but at around 200 concurrent connections CPU usage spikes, so something is going on. Unfortunately I haven't quite found the right tool or the right instrumentation to pin down what the issue is. I'm wondering if there is something in my network configuration, or in webmachine/mochiweb, that could cause it. I'll probably try some simple concurrent profiling of a hello-world webmachine resource as a sanity check, to see if it also runs into issues, then layer in some of my logic from there to see if I can reproduce what I'm seeing in a simpler test.
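For reference, the hello-world resource I have in mind is just the canonical webmachine skeleton, something along these lines (the module name and dispatch path are placeholders, not from my real app):

    -module(hello_resource).
    -export([init/1, to_html/2]).

    -include_lib("webmachine/include/webmachine.hrl").

    %% No per-request state is needed for the sanity check.
    init([]) ->
        {ok, undefined}.

    %% Return a static body so any CPU knee we see comes from the stack
    %% (mochiweb/webmachine/VM) rather than from application logic.
    to_html(ReqData, State) ->
        {"<html><body>Hello, world</body></html>", ReqData, State}.

mapped in with a dispatch rule like {["hello"], hello_resource, []}, and the load generator pointed at /hello.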
Thanks again for the pointer,
-Anthony
On Feb 17, 2014, at 1:30 AM, Dmitry Kolesnikov <dmitry@REDACTED> wrote:
> Hello,
>
> Try to profile your application.
> I found this utility very useful for eliminating performance bottlenecks in an app.
> https://github.com/virtan/eep
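> If I remember the eep README correctly, the basic flow is roughly the following (double-check the README for the exact function names in the version you install):
>
>     %% in a shell on the node, while it is under load
>     eep:start_file_tracing("myapp"),   %% "myapp" is just an output file prefix
>     timer:sleep(10000),                %% capture ~10 seconds of the busy period
>     eep:stop_tracing(),
>     eep:convert_tracing("myapp").      %% should produce callgrind.out.myapp
>
> and then you open the callgrind.out.* file in kcachegrind to see where the time goes.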
>
> You can slightly boost your application performance by
> * compiling the runtime and the app to native code
> * adjusting emulator flags: +sbt ... +sws ... +swt ...
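> For example (which values actually help depends on your hardware and OTP release, so treat these only as a starting point and check erl(1) and erlc(1) for details):
>
>     # compile a hot module to native code
>     erlc +native +'{hipe, [o3]}' my_hot_module.erl
>
>     # bind schedulers and lower the wakeup threshold
>     erl +K true +sbt db +swt low ...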
>
> However, none of the above will help if you have a bottleneck by design. Try to profile your system to see the busiest processes.
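> A quick way to do that from the shell, with no extra tools, is to sort processes by reduction count (a rough proxy for CPU work), for example:
>
>     %% top N processes by reductions accumulated since they were spawned
>     F = fun(N) ->
>             Counts = [{P, R} || P <- erlang:processes(),
>                                 {reductions, R} <- [erlang:process_info(P, reductions)]],
>             lists:sublist(lists:reverse(lists:keysort(2, Counts)), N)
>         end,
>     F(10).
>
> Run it a few times during the CPU spike and watch which processes' counts grow fastest.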
>
> Best Regards,
> Dmitry
>
> On 12 Feb 2014, at 03:04, ANTHONY MOLINARO <anthonym@REDACTED> wrote:
>
>> So '+K false' seems to cause quite a bit less lock contention, although it doesn't remove the performance knee :(
>>
>> Here's with '+K true'
>>
>> lock id #tries #collisions collisions [%] time [us] duration [%]
>> ----- --- ------- ------------ --------------- ---------- -------------
>> pollset 1 2439901 81002 3.3199 617130 0.8811
>> proc_main 159392 6402529 13807 0.2156 311906 0.4453
>> drv_ev_state 16 2373507 15719 0.6623 189350 0.2703
>> run_queue 16 9288814 58470 0.6295 136295 0.1946
>> process_table 1 457253 5411 1.1834 116726 0.1666
>> proc_link 159392 2182726 3669 0.1681 102834 0.1468
>> proc_status 159392 5879890 3079 0.0524 81224 0.1160
>> pix_lock 1024 1183 3 0.2536 75758 0.1082
>> proc_msgq 159392 4032954 3387 0.0840 35624 0.0509
>> timeofday 1 1403251 7160 0.5102 21584 0.0308
>> atom_tab 1 1948202 19 0.0010 2494 0.0036
>> timer_wheel 1 588379 1127 0.1915 1197 0.0017
>> pollset_rm_list 1 1116613 2220 0.1988 920 0.0013
>> db_hash_slot 576 642980 209 0.0325 578 0.0008
>> make_ref 1 372880 581 0.1558 554 0.0008
>> alcu_allocator 9 18660 18 0.0965 88 0.0001
>>
>> and here's with '+K false'
>>
>> lock id #tries #collisions collisions [%] time [us] duration [%]
>> ----- --- ------- ------------ --------------- ---------- -------------
>> proc_main 159151 6387799 13079 0.2047 626344 0.8941
>> proc_link 159151 2180755 3625 0.1662 132246 0.1888
>> run_queue 16 8997639 51163 0.5686 119858 0.1711
>> proc_status 159151 5837885 3160 0.0541 114003 0.1627
>> pix_lock 1024 1231 9 0.7311 96727 0.1381
>> proc_msgq 159151 4018925 3202 0.0797 83066 0.1186
>> process_table 1 452460 5544 1.2253 78856 0.1126
>> drv_ev_state 16 2454941 30392 1.2380 72814 0.1039
>> pollset 1 2424508 28413 1.1719 41359 0.0590
>> timeofday 1 870046 4161 0.4783 12728 0.0182
>> atom_tab 1 1950265 59 0.0030 6262 0.0089
>> timer_wheel 1 481116 958 0.1991 1148 0.0016
>> pollset_rm_list 1 784445 1383 0.1763 923 0.0013
>> db_hash_slot 576 645025 213 0.0330 462 0.0007
>> make_ref 1 370122 581 0.1570 378 0.0005
>> alcu_allocator 9 18229 10 0.0549 240 0.0003
>> async_enq_mtx 16 6505 1 0.0154 1 0.0000
>>
>> Unfortunately, today was one of my meeting days so I didn't get to try more than this, but it is interesting to see the effect that +K has on locking.
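>> For anyone following along: the tables above are lcnt output (collected on an emulator built with --enable-lock-counter), gathered with roughly the following sequence, where the sleep is sized to cover the ramp past the knee:
>>
>>     lcnt:start(),
>>     lcnt:clear(),             %% reset the counters
>>     timer:sleep(60000),       %% let the load test run through the knee
>>     lcnt:collect(),
>>     lcnt:conflicts().         %% prints the lock/collision table
>>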
>>
>> I think I'll try tracing during the knee (hopefully it won't be too much overhead) to see if I can figure out why I am still seeing the spike in CPU.
>>
>> -Anthony
>>
>> On Feb 10, 2014, at 11:11 PM, ANTHONY MOLINARO <anthonym@REDACTED> wrote:
>>
>>> No, I haven't other than setting '+K true' (are there other tweaks)?
>>>
>>> One of my experiments tomorrow will be to set '+K false' to see whether falling back to select avoids the pollset contention.
>>>
>>> I'm also planning to instrument the number of times I call erlang:yield/0 when I fail to get a resource. I know the documentation warns about using erlang:yield/0 but it does seem to do what I want, which is to try again in a little bit (I want to keep the connections open and retry up to a timeout, which should have the effect of also applying back pressure to upstream load balancers).
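>>> The shape of what I'm doing is roughly this (just a sketch; the checkout fun and the names are placeholders, not my real code):
>>>
>>>     -module(resource_retry).
>>>     -export([acquire/2]).
>>>
>>>     %% CheckoutFun is any zero-arity fun returning {ok, Resource} or empty.
>>>     %% Retry until TimeoutMs has elapsed, yielding between attempts so the
>>>     %% scheduler can run other work while this connection waits.
>>>     acquire(CheckoutFun, TimeoutMs) ->
>>>         acquire(CheckoutFun, os:timestamp(), TimeoutMs).
>>>
>>>     acquire(CheckoutFun, Start, TimeoutMs) ->
>>>         case CheckoutFun() of
>>>             {ok, Resource} ->
>>>                 {ok, Resource};
>>>             empty ->
>>>                 Elapsed = timer:now_diff(os:timestamp(), Start) div 1000,
>>>                 if
>>>                     Elapsed >= TimeoutMs ->
>>>                         {error, timeout};
>>>                     true ->
>>>                         erlang:yield(),   %% give other processes a chance to run
>>>                         acquire(CheckoutFun, Start, TimeoutMs)
>>>                 end
>>>         end.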
>>>
>>> Anyway, still looking for more suggestions on things to look for.
>>>
>>> Thanks,
>>>
>>> -Anthony
>>>
>>>
>>> On Feb 10, 2014, at 6:10 PM, Anthony Ramine <n.oxyde@REDACTED> wrote:
>>>
>>>> Did you make any tweaks to polling, etc.? This lock is owned by erl_poll.
>>>>
>>>> --
>>>> Anthony Ramine
>>>>
>>>> On Feb 11, 2014, at 1:45 AM, ANTHONY MOLINARO <anthonym@REDACTED> wrote:
>>>>
>>>>> pollset 1 2369355 71578 3.0210 553249 0.7896
>>>>
>>>
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>