[erlang-questions] Performance analysis advice
ANTHONY MOLINARO
anthonym@REDACTED
Tue Feb 18 05:25:09 CET 2014
Hi Dmitry,
Thanks, I'll try this out. I tried fprof and percept2 but unfortunately didn't have much luck sifting through the output. I'll try eep out, and I saw eflame mentioned in another message which I'll also check out. Unfortunately, I've mostly had to discover these things in a haphazard way, as this email has been lost among the massive threads taking up everyone's email energy ;).
For the most part, I've tried to design the application to remove contention. At the moment message passing is minimal, but at around 200 concurrent connections CPU usage spikes, so something is going on. Unfortunately I haven't quite found the right tool or the right instrumentation to pin down what the issue is. I'm wondering if there is something in my network configuration, or in webmachine/mochiweb, that could cause it. I'll probably try some simple concurrent profiling of a hello-world webmachine resource as a sanity check, to see if it also runs into issues, then layer in some of my logic from there to see if I can reproduce what I'm seeing in a simpler test.
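For reference, the hello-world resource I have in mind is just the canonical webmachine skeleton, something along these lines (the module name and dispatch path are placeholders, not from my real app):

    -module(hello_resource).
    -export([init/1, to_html/2]).

    -include_lib("webmachine/include/webmachine.hrl").

    %% No per-request state is needed for the sanity check.
    init([]) ->
        {ok, undefined}.

    %% Return a static body so any CPU knee we see comes from the stack
    %% (mochiweb/webmachine/VM) rather than from application logic.
    to_html(ReqData, State) ->
        {"<html><body>Hello, world</body></html>", ReqData, State}.

mapped in with a dispatch rule like {["hello"], hello_resource, []}, and the load generator pointed at /hello.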
Thanks again for the pointer,
-Anthony
On Feb 17, 2014, at 1:30 AM, Dmitry Kolesnikov <dmitry@REDACTED> wrote:
> Hello,
>
> Try to profile your application.
> I found this utility very useful for eliminating performance bottlenecks in an app.
> https://github.com/virtan/eep
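> If I remember the eep README correctly, the basic flow is roughly the following (double-check the README for the exact function names in the version you install):
>
>     %% in a shell on the node, while it is under load
>     eep:start_file_tracing("myapp"),   %% "myapp" is just an output file prefix
>     timer:sleep(10000),                %% capture ~10 seconds of the busy period
>     eep:stop_tracing(),
>     eep:convert_tracing("myapp").      %% should produce callgrind.out.myapp
>
> and then you open the callgrind.out.* file in kcachegrind to see where the time goes.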
>
> You can slightly boost your application performance by
> * compiling the runtime and the app to native code
> * adjusting emulator flags: +sbt ... +sws ... +swt ...
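> For example (which values actually help depends on your hardware and OTP release, so treat these only as a starting point and check erl(1) and erlc(1) for details):
>
>     # compile a hot module to native code
>     erlc +native +'{hipe, [o3]}' my_hot_module.erl
>
>     # bind schedulers and lower the wakeup threshold
>     erl +K true +sbt db +swt low ...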
>
> However, none of the above will help if you have a bottleneck by design. Try to profile your system to see the busiest processes.
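> A quick way to do that from the shell, with no extra tools, is to sort processes by reduction count (a rough proxy for CPU work), for example:
>
>     %% top N processes by reductions accumulated since they were spawned
>     F = fun(N) ->
>             Counts = [{P, R} || P <- erlang:processes(),
>                                 {reductions, R} <- [erlang:process_info(P, reductions)]],
>             lists:sublist(lists:reverse(lists:keysort(2, Counts)), N)
>         end,
>     F(10).
>
> Run it a few times during the CPU spike and watch which processes' counts grow fastest.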
>
> Best Regards,
> Dmitry
>
> On 12 Feb 2014, at 03:04, ANTHONY MOLINARO <anthonym@REDACTED> wrote:
>
>> So '+K false' seems to cause quite a bit less lock contention, although it doesn't remove the performance knee :(
>>
>> Here's with '+K true'
>>
>> lock id #tries #collisions collisions [%] time [us] duration [%]
>> ----- --- ------- ------------ --------------- ---------- -------------
>> pollset 1 2439901 81002 3.3199 617130 0.8811
>> proc_main 159392 6402529 13807 0.2156 311906 0.4453
>> drv_ev_state 16 2373507 15719 0.6623 189350 0.2703
>> run_queue 16 9288814 58470 0.6295 136295 0.1946
>> process_table 1 457253 5411 1.1834 116726 0.1666
>> proc_link 159392 2182726 3669 0.1681 102834 0.1468
>> proc_status 159392 5879890 3079 0.0524 81224 0.1160
>> pix_lock 1024 1183 3 0.2536 75758 0.1082
>> proc_msgq 159392 4032954 3387 0.0840 35624 0.0509
>> timeofday 1 1403251 7160 0.5102 21584 0.0308
>> atom_tab 1 1948202 19 0.0010 2494 0.0036
>> timer_wheel 1 588379 1127 0.1915 1197 0.0017
>> pollset_rm_list 1 1116613 2220 0.1988 920 0.0013
>> db_hash_slot 576 642980 209 0.0325 578 0.0008
>> make_ref 1 372880 581 0.1558 554 0.0008
>> alcu_allocator 9 18660 18 0.0965 88 0.0001
>>
>> and here's with '+K false'
>>
>> lock id #tries #collisions collisions [%] time [us] duration [%]
>> ----- --- ------- ------------ --------------- ---------- -------------
>> proc_main 159151 6387799 13079 0.2047 626344 0.8941
>> proc_link 159151 2180755 3625 0.1662 132246 0.1888
>> run_queue 16 8997639 51163 0.5686 119858 0.1711
>> proc_status 159151 5837885 3160 0.0541 114003 0.1627
>> pix_lock 1024 1231 9 0.7311 96727 0.1381
>> proc_msgq 159151 4018925 3202 0.0797 83066 0.1186
>> process_table 1 452460 5544 1.2253 78856 0.1126
>> drv_ev_state 16 2454941 30392 1.2380 72814 0.1039
>> pollset 1 2424508 28413 1.1719 41359 0.0590
>> timeofday 1 870046 4161 0.4783 12728 0.0182
>> atom_tab 1 1950265 59 0.0030 6262 0.0089
>> timer_wheel 1 481116 958 0.1991 1148 0.0016
>> pollset_rm_list 1 784445 1383 0.1763 923 0.0013
>> db_hash_slot 576 645025 213 0.0330 462 0.0007
>> make_ref 1 370122 581 0.1570 378 0.0005
>> alcu_allocator 9 18229 10 0.0549 240 0.0003
>> async_enq_mtx 16 6505 1 0.0154 1 0.0000
>>
>> Unfortunately, today was one of my meeting days so I didn't get to try more than this, but it is interesting to see the effect that +K has on locking.
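>> For anyone following along: the tables above are lcnt output (collected on an emulator built with --enable-lock-counter), gathered with roughly the following sequence, where the sleep is sized to cover the ramp past the knee:
>>
>>     lcnt:start(),
>>     lcnt:clear(),             %% reset the counters
>>     timer:sleep(60000),       %% let the load test run through the knee
>>     lcnt:collect(),
>>     lcnt:conflicts().         %% prints the lock/collision table
>>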
>>
>> I think I'll try tracing during the knee (hopefully it won't be too much overhead) to see if I can figure out why I am still seeing the spike in CPU.
>>
>> -Anthony
>>
>> On Feb 10, 2014, at 11:11 PM, ANTHONY MOLINARO <anthonym@REDACTED> wrote:
>>
>>> No, I haven't other than setting '+K true' (are there other tweaks)?
>>>
>>> One of my experiments tomorrow will be to set '+K false' to see whether falling back to select avoids the pollset contention.
>>>
>>> I'm also planning to instrument the number of times I call erlang:yield/0 when I fail to get a resource. I know the documentation warns about using erlang:yield/0 but it does seem to do what I want, which is to try again in a little bit (I want to keep the connections open and retry up to a timeout, which should have the effect of also applying back pressure to upstream load balancers).
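>>> The shape of what I'm doing is roughly this (just a sketch; the checkout fun and the names are placeholders, not my real code):
>>>
>>>     -module(resource_retry).
>>>     -export([acquire/2]).
>>>
>>>     %% CheckoutFun is any zero-arity fun returning {ok, Resource} or empty.
>>>     %% Retry until TimeoutMs has elapsed, yielding between attempts so the
>>>     %% scheduler can run other work while this connection waits.
>>>     acquire(CheckoutFun, TimeoutMs) ->
>>>         acquire(CheckoutFun, os:timestamp(), TimeoutMs).
>>>
>>>     acquire(CheckoutFun, Start, TimeoutMs) ->
>>>         case CheckoutFun() of
>>>             {ok, Resource} ->
>>>                 {ok, Resource};
>>>             empty ->
>>>                 Elapsed = timer:now_diff(os:timestamp(), Start) div 1000,
>>>                 if
>>>                     Elapsed >= TimeoutMs ->
>>>                         {error, timeout};
>>>                     true ->
>>>                         erlang:yield(),   %% give other processes a chance to run
>>>                         acquire(CheckoutFun, Start, TimeoutMs)
>>>                 end
>>>         end.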
>>>
>>> Anyway, still looking for more suggestions on things to look for.
>>>
>>> Thanks,
>>>
>>> -Anthony
>>>
>>>
>>> On Feb 10, 2014, at 6:10 PM, Anthony Ramine <n.oxyde@REDACTED> wrote:
>>>
>>>> Did you make any tweaks to polling, etc.? This lock is owned by erl_poll.
>>>>
>>>> --
>>>> Anthony Ramine
>>>>
>>>> On Feb 11, 2014, at 1:45 AM, ANTHONY MOLINARO <anthonym@REDACTED> wrote:
>>>>
>>>>> pollset 1 2369355 71578 3.0210 553249 0.7896
>>>>
>>>
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>