[erlang-questions] Improve performance of IO bounded server written in Erlang via having pollset for each scheduler and bind port to scheduler together with process
Thu Jul 12 11:58:37 CEST 2012
2012/7/12 Zabrane Mickael <>:
> Hi Wei,
>>> We already surpassed the 100krps on an 8-cores machine with our HTTP server
>>> (~150K rps).
>> Which Erlang version did you use to get ~150k rps on the 8-core machine,
>> patched or unpatched?
> We reach the 150K on the unpatched version.
>> If it was measured on an unpatched Erlang
>> version, would you mind measuring it on the patched version and letting me
>> know the result?
> I haven't yet adapted our code to use the VM with your patch.
> I'll keep you informed.
>> Today I found a lock bottleneck through SystemTap, trace-cmd and lcnt.
>> After fixing it, ehttpd on my 16-core machine can reach 325k rps.
>> RX packets: 326117 TX packets: 326122
>> RX packets: 326845 TX packets: 326859
>> RX packets: 327983 TX packets: 327996
>> RX packets: 326651 TX packets: 326624
>> This is the upper limit of our Gigabit network card. I ran ab on three
>> standalone machines to generate enough load. I posted the fix to
>> GitHub, give it a try ~
> That's simply fantastic. Could you share your bottleneck tracking method?
> Any new VM patch to provide?
Through perf top, I saw that a large percentage of time is wasted in:
1894.00 16.0% _spin_lock
566.00 4.8% process_main
After dumping _spin_lock's call stacks via trace-cmd and computing
statistics on them, I found that most _spin_lock calls come from
futex_wake, which is invoked by the pthread mutex.
Finally, I used lcnt to locate all lock collisions in the Erlang VM and
found that the timeofday mutex is the bottleneck.
     lock                   location  #tries  #collisions  collisions [%]  time [us]  duration [%]
    -----                  ---------  ------  -----------  --------------  ---------  ------------
timeofday  'beam/erl_time_sup.c':939  895234       551957         61.6551    3185159       23.5296
timeofday  'beam/erl_time_sup.c':971  408006       264498         64.8270    1473816       10.8874
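
As a quick sanity check on the lcnt figures above, the "collisions [%]" column is simply #collisions divided by #tries. The short Python sketch below recomputes it from the two rows; the tuple layout and the helper name collision_pct are only illustrative and are not part of lcnt.

```python
# Rows copied from the lcnt output above: (lock, location, tries, collisions).
rows = [
    ("timeofday", "beam/erl_time_sup.c:939", 895234, 551957),
    ("timeofday", "beam/erl_time_sup.c:971", 408006, 264498),
]


def collision_pct(tries, collisions):
    """Share of lock attempts that found the lock already held, in percent."""
    return 100.0 * collisions / tries


for lock, location, tries, collisions in rows:
    print(f"{lock} {location}: {collision_pct(tries, collisions):.4f}%")
```

More than 60% of the attempts to take this mutex had to wait at both call sites, which is why it stands out as the bottleneck.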
The timeofday mutex is locked each time erts_check_io is invoked, in order
to "sync the machine's idea of time". erts_check_io is executed hundreds
of thousands of times per second, so the mutex is locked far too often.
I solved this problem by moving the sync operation into a standalone
thread that is invoked once per millisecond.
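
The idea can be illustrated with a small sketch. This is not the actual patch (which is C code inside the VM); it is a Python illustration of the same pattern, assuming a hypothetical CachedClock class: one background thread refreshes a cached timestamp about once per millisecond, and the hot path only reads the cached value without taking any lock.

```python
import threading
import time


class CachedClock:
    """Hypothetical helper: a ticker thread refreshes a cached timestamp."""

    def __init__(self, interval=0.001):
        self._interval = interval          # refresh period in seconds (1 ms)
        self._now = time.monotonic()       # cached value read by the hot path
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Event.wait() serves as both the sleep and the stop signal:
        # it returns False on timeout and True once stop() was called.
        while not self._stop.wait(self._interval):
            self._now = time.monotonic()

    def now(self):
        # Hot path: a plain attribute read, no mutex involved.
        # (In C this would need an atomic load/store instead.)
        return self._now

    def stop(self):
        self._stop.set()
        self._thread.join()
```

A caller would create one CachedClock, call start() once, and then use now() wherever the time was previously synced under the mutex. The trade-off is that the returned time can be up to roughly one refresh interval stale.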