[erlang-questions] Improve performance of IO bounded server written in Erlang via having pollset for each scheduler and bind port to scheduler together with process
Zabrane Mickael
Thu Jul 12 13:09:02 CEST 2012
For everyone using ab or siege, I strongly advise you to move to a better tool.
The best one for now, which supports threading and the fast libev event loop, is weighttp (from the Lighttpd web server project).
weighttp uses exactly the same syntax as ab, so there is nothing to change in your benchmarks.
Hope this helps!
# Weighttp:
# http://redmine.lighttpd.net/projects/weighttp/wiki
1. LibEV (http://software.schmorp.de/pkg/libev.html)
cvs -z3 -d :pserver:anonymous@REDACTED/schmorpforge co libev
cd libev
aclocal && automake --add-missing && autoconf
libtoolize --copy --force --ltdl
sh autogen.sh && ./configure --prefix=/usr && make && make install
2. Weighttp (http://redmine.lighttpd.net/projects/weighttp/wiki)
git clone git://git.lighttpd.net/weighttp
cd weighttp
./waf configure
./waf build
./waf install
On Jul 12, 2012, at 1:01 PM, Zabrane Mickael wrote:
> Hi,
> Good news. With today's new patch:
> old bench: ~70K rps
> new bench: ~85K rps
> That is 15K more rps handled now!
> We're not far from 100K rps ;-)
> Well done Wei.
> Regards,
> Zabrane
> On Jul 12, 2012, at 11:58 AM, Wei Cao wrote:
>> 2012/7/12 Zabrane Mickael <zabrane3@REDACTED>:
>>> Hi Wei,
>>>>> We already surpassed 100K rps on an 8-core machine with our HTTP server
>>>>> (~150K rps).
>>>> Which Erlang version did you use to get ~150K rps on an 8-core machine,
>>>> patched or unpatched?
>>> We reached 150K on the unpatched version.
>>>> If it was measured on an unpatched Erlang
>>>> version, would you mind measuring it on the patched version and letting me
>>>> know the result?
>>> I haven't yet adapted our code to use a VM with your patch.
>>> I'll keep you informed.
>>>> Today I found a lock bottleneck using SystemTap, trace-cmd and lcnt.
>>>> After fixing it, ehttpd on my 16-core machine can reach 325K rps.
>>>> RX packets: 326117 TX packets: 326122
>>>> RX packets: 326845 TX packets: 326859
>>>> RX packets: 327983 TX packets: 327996
>>>> RX packets: 326651 TX packets: 326624
>>>> This is the upper limit of our Gigabit network card. I ran ab on three
>>>> standalone machines to generate enough load. I posted the fix to
>>>> GitHub, give it a try ~
>>> That's simply fantastic. Could you share your bottleneck tracking method?
>>> Any new VM patch to provide?
>> Through perf top, I saw that a large percentage of time is wasted in
>> the kernel's _spin_lock:
>> 1894.00  16.0%  _spin_lock    /usr/lib/debug/lib/modules/2.6.32-131.21.1.tb477.el6.x86_64/vmlinux
>>  566.00   4.8%  process_main  /home/mingsong.cw/erlangpps/lib/erlang/erts-5.10/bin/beam.smp
>> After dumping _spin_lock's call stacks via trace-cmd and computing
>> statistics over them, I found that most _spin_lock calls come from
>> futex_wake, which is invoked by pthread mutexes.
>> Finally, I used lcnt to locate all lock collisions in the Erlang VM and
>> found that the timeofday mutex is the bottleneck.
>> lock       location                   #tries  #collisions  collisions [%]  time [us]  duration [%]
>> ---------  -------------------------  ------  -----------  --------------  ---------  ------------
>> timeofday  'beam/erl_time_sup.c':939  895234       551957         61.6551    3185159       23.5296
>> timeofday  'beam/erl_time_sup.c':971  408006       264498         64.8270    1473816       10.8874
>> The timeofday mutex is locked each time erts_check_io is invoked, in order to
>> "sync the machine's idea of time". erts_check_io is executed hundreds
>> of thousands of times per second, so the mutex is locked far too often, which
>> reduces performance.
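
As a quick check of the lcnt figures quoted above, the collision percentage is simply collisions divided by tries. The snippet below is a minimal sketch in Python and is not part of the original thread. The tuple layout and the helper name `collision_rate` are illustrative assumptions. The numbers are copied from the lcnt output.

```python
# Rows copied from the lcnt output quoted in this message:
# (lock, location, tries, collisions, time_us)
ROWS = [
    ("timeofday", "beam/erl_time_sup.c:939", 895234, 551957, 3185159),
    ("timeofday", "beam/erl_time_sup.c:971", 408006, 264498, 1473816),
]


def collision_rate(tries, collisions):
    """Percentage of lock attempts that found the lock already held."""
    return 100.0 * collisions / tries


for lock, location, tries, collisions, time_us in ROWS:
    rate = collision_rate(tries, collisions)
    print(f"{lock} {location}: {rate:.4f}% collisions")

# Both call sites contend on the same mutex, so also look at them together.
total_tries = sum(row[2] for row in ROWS)
total_collisions = sum(row[3] for row in ROWS)
combined = collision_rate(total_tries, total_collisions)
print(f"combined: {combined:.2f}% of {total_tries} attempts collided")
```

At both call sites, more than 60% of lock attempts collided, which is consistent with the futex_wake activity seen in the kernel profile.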
>> I solved this problem by moving the sync operation into a standalone
>> thread that is invoked once per millisecond.
>>> Regards,
>>> Zabrane
>> --
>> Best,
>> Wei Cao
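
The fix Wei describes (refreshing the time from one dedicated thread about once per millisecond, instead of taking the timeofday mutex on every erts_check_io call) can be illustrated with a minimal sketch. This is Python, not the actual C patch to the Erlang VM. The class name `CachedClock`, its methods, and the `interval` parameter are hypothetical and chosen only for illustration.

```python
import threading
import time


class CachedClock:
    """Hypothetical sketch: one thread refreshes a cached timestamp,
    readers load it without taking any lock."""

    def __init__(self, interval=0.001):
        self._interval = interval        # refresh period in seconds (1 ms)
        self._now = time.monotonic()     # cached timestamp
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait() doubles as an interruptible sleep: it returns False
        # on timeout and True once stop() has been called.
        while not self._stop.wait(self._interval):
            self._now = time.monotonic()  # single writer, plain store

    def now(self):
        # Hot path: no lock taken; the value may be about one interval stale.
        return self._now


clock = CachedClock()
clock.start()
first = clock.now()
time.sleep(0.05)          # give the updater thread time to tick
second = clock.now()
clock.stop()
print(f"cached time advanced by {second - first:.4f} s")
```

The trade-off is that readers may see a timestamp that is slightly stale, in exchange for removing lock contention from the hot path. In C the cached value would also need to be published safely (for example with an atomic store), a detail this sketch leaves out and the thread does not spell out.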