[erlang-patches] Pollset per scheduler and bind port to scheduler

Lukas Larsson lukas@REDACTED
Wed Jul 18 11:19:24 CEST 2012


Great, I'll include it in our nightly builds again and let you know the
results.

Lukas
On 18/07/12 11:09, Wei Cao wrote:
> Thanks, the warnings disappear after disabling hipe. I then ran several
> tests with the debug build, fixed the assertion failures they found, and it
> works all right now.
>
> The fixes are pushed to the pollset_per_scheduler branch at
> git://github.com/weicao/otp.git
>
>
> 2012/7/18 Lukas Larsson <lukas@REDACTED>:
>> Hi!
>>
>> You should disable hipe before trying to compile the debug build. Also, I'm
>> unsure whether just adding -DDEBUG to CFLAGS is enough. The way we build a
>> debug emulator is:
>>
>> (cd $ERL_TOP && ./otp_build autoconf && ./otp_build configure --disable-hipe && ./otp_build boot)
>> (cd $ERL_TOP/erts/emulator && make FLAVOR=smp debug)
>>
>> and then start it using
>>
>> $ERL_TOP/bin/cerl -debug
>>
>> Lukas
>>
>>
>> On 18/07/12 08:23, Wei Cao wrote:
>>> 2012/7/18 Wei Cao <cyg.cao@REDACTED>:
>>>> 2012/7/17 Lukas Larsson <lukas@REDACTED>:
>>>>> Hi!
>>>>>
>>>>> After fixing the patch to compile on windows[1] I put it in our daily
>>>>> builds
>>>> Glad to hear it compiles on Windows now, thanks a lot~ I tried to set up
>>>> mingw etc. on a Windows virtual machine yesterday, but it ran very slowly
>>>> and I hit a lot of errors during configuration.
>>>>
>>>>> and a couple of issues came up. I did not include the 'move
>>>>> erts_deliver_time out of erl_check_io' commit as it deadlocks the
>>>>> non-smp emulator.
>>>> Yeah, I can confirm this problem. There is only one OS thread in the
>>>> non-smp emulator, so if check_io starts waiting, other threads have no
>>>> opportunity to run, including the thread I added to call
>>>> erts_deliver_time() periodically. check_io then has no timer to wake it
>>>> up and waits endlessly.
>>>>
>>>> I'll look for a better solution.
>>>>
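The deadlock described above comes from a single thread blocking in the poll call with nothing left to wake it. Below is a minimal sketch in C of one conventional way to avoid that, assuming the loop knows the next timer deadline: bound the poll timeout by that deadline instead of relying on a separate thread. The function name poll_timeout_ms and the millisecond clock values are hypothetical; this is not taken from the patch or from ERTS.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: how long a single-threaded loop may block in poll().
 * now_ms and next_timer_ms are non-negative millisecond timestamps;
 * next_timer_ms < 0 means "no timer is pending". */
int poll_timeout_ms(int64_t now_ms, int64_t next_timer_ms)
{
    if (next_timer_ms < 0)
        return -1;                  /* no timer: blocking indefinitely is safe */
    if (next_timer_ms <= now_ms)
        return 0;                   /* timer already due: do not block at all */

    int64_t delta = next_timer_ms - now_ms;
    /* Clamp to the range of poll()'s int timeout argument. */
    return delta > INT_MAX ? INT_MAX : (int)delta;
}
```

The returned value can be passed as the timeout argument of poll(2), where -1 means block until an event arrives and 0 means return immediately.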
>>>>> When running a debug test build on Linux we got the following assertion:
>>>>> Assertion failed: 0 <= (ix) && (ix) < erts_no_pollsets in
>>>>> sys/common/erl_check_io.c, line 1909
>>>> Hm... this is a very silly mistake: the variable erts_no_pollsets is
>>>> set after the assertion. After fixing it (pushed to the
>>>> pollset_per_scheduler branch), I hit another assertion failure:
>>>>
>>>> erlc -W  +debug_info +inline -o../ebin hipe_icode2rtl.erl
>>>> Assertion failed: VALID_INSTR(* (Eterm *)(c_p->i)) in beam/beam_emu.c,
>>>> line 1241
>>>>
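The ordering mistake with erts_no_pollsets mentioned above can be illustrated with a minimal C sketch: a range check against a counter can only pass once the counter has been set. The names no_pollsets, pollset_index_ok and init_pollsets are hypothetical stand-ins, not the actual code in erl_check_io.c.

```c
#include <assert.h>

/* Stand-in for erts_no_pollsets; zero until initialization has run. */
static int no_pollsets = 0;

/* Same shape of range check as the failing assertion:
 * 0 <= ix && ix < no_pollsets. */
int pollset_index_ok(int ix)
{
    return 0 <= ix && ix < no_pollsets;
}

/* Hypothetical init routine: set the bound first, then run code that
 * asserts against it. With the two steps swapped, the very first check
 * would fail because no_pollsets would still be 0. */
int init_pollsets(int n)
{
    assert(n > 0);
    no_pollsets = n;
    for (int ix = 0; ix < n; ix++)
        assert(pollset_index_ok(ix));
    return no_pollsets;
}
```

Before init_pollsets() has run, every index is rejected, which is the situation the failing assertion reported.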
>>> I tried the latest master branch of erlang/otp (without my patch),
>>> configured with
>>> ./configure CFLAGS="-DDEBUG -g -O3 -fomit-frame-pointer"
>>> and got the same assertion failure:
>>> Assertion failed: VALID_INSTR(* (Eterm *)(c_p->i)) in beam/beam_emu.c,
>>> line 1241
>>>
>>> However, if I instead configure with
>>> ./configure CFLAGS="-DDEBUG", the assertion failure above disappears and
>>> warnings like this are reported instead:
>>>
>>> bit length overflow
>>> code 12 bits 7->6
>>>
>>> It seems the CFLAGS influence these assertions.
>>>
>>>>> On OS X Lion we got the following when compiling gs:
>>>>> erl -pa ../ebin -s gs_make -s erlang halt -noshell
>>>>> ../include/internal/ethr_mutex.h:655: Fatal error in ethr_mutex_lock():
>>>>> Invalid argument (22)
>>>>> make[3]: *** [gstk_generic.hrl] Abort trap: 6
>>>>> make[2]: *** [opt] Error 2
>>>>> make[1]: *** [opt] Error 2
>>>>> make: *** [libs] Error 2
>>>> I don't have a Mac to test this on.
>>>>
>>>>> I'll remove the branch and see if the same problems appear again tomorrow
>>>>> (unfortunately we do not have enough machines to let your branch run
>>>>> alone, so it might be some other branch causing this). Let me know if you
>>>>> need any help tracking down these issues.
>>>>>
>>>>> Lukas
>>>>>
>>>>> [1]: https://github.com/garazdawi/otp/tree/wc/pollset_per_scheduler
>>>>>
>>>>>
>>>>> On 11/07/12 17:34, Wei Cao wrote:
>>>>>
>>>>> In non-keep-alive cases, all new connections are accepted by the
>>>>> Erlang port which listens on the TCP port, and how frequently/fast that
>>>>> port is scheduled to run limits the QPS (requests per second), so
>>>>> this port can be regarded as the bottleneck of non-keep-alive
>>>>> applications.
>>>>>
>>>>> So I guess the performance degradation observed is caused by the listener
>>>>> port not being scheduled frequently or fast enough. I'll look into this
>>>>> problem tomorrow; it is night in China now :-)
>>>>>
>>>>> BTW, I found today that this patch should be compiled like this:
>>>>> ./configure CFLAGS="-DERTS_POLLSET_PER_SCHEDULER -g -O3 -fomit-frame-pointer"
>>>>> otherwise compiler optimization is disabled.
>>>>>
>>>>> Regarding binding processes/ports to schedulers, I admit that binding a
>>>>> port to the same scheduler as its owner process, as the pb patch does, is
>>>>> really a temporary solution. I suggest adding an additional BIF, similar
>>>>> to erlang:process_flag, that lets the user explicitly bind a port to a
>>>>> given scheduler when it helps.
>>>>>
>>>>>
>>>>> 2012/7/11 Lukas Larsson <lukas@REDACTED>:
>>>>>
>>>>> Hi,
>>>>>
>>>>> The reason I'm skeptical about anything which binds processes/ports to
>>>>> schedulers is that it feels like a temporary solution, and I would much
>>>>> rather do a proper solution where the scheduler takes care of these
>>>>> things for you. But as I said, internally we need to talk this over when
>>>>> it is not in the middle of summer vacation.
>>>>>
>>>>> I did some benchmarking using ab and found basically the same figures as
>>>>> you. The below is with keep-alive and the values are requests per
>>>>> second:
>>>>>
>>>>>                 not-bound    bound
>>>>> R15B01          44k          37k
>>>>> master          44k          35k
>>>>> master+mp       48k          49k
>>>>> master+mp+pb    49k          55k
>>>>>
>>>>> [mp]: multi-poll patch
>>>>> [pb]: port bind patch
>>>>> [bound]: Used {scheduler,I} to spread load
>>>>>
>>>>> Unfortunately I also found that when doing the non-keep-alive benchmark
>>>>> the performance is seriously degraded:
>>>>>
>>>>> R15B01 not-bound          8255
>>>>> master+mp+pb not-bound    7668
>>>>> master+mp+pb bound        5765
>>>>>
>>>>> I did some gprof runs but could not find anything obvious that is going
>>>>> wrong.
>>>>>
>>>>> Lukas
>>>>>
>>>>>
>>>>> On 11/07/12 04:21, Wei Cao wrote:
>>>>>
>>>>> I added a macro to conditionally compile the patch because I thought it
>>>>> would make it easier to select. I can remove the macro, fix the
>>>>> compilation error and test on the mingw platform in a later version.
>>>>>
>>>>> How about providing another BIF named port_flag (like process_flag) to
>>>>> let the user bind a port to a given scheduler?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>>
>>>> Best,
>>>>
>>>> Wei Cao
>>>
>>>
>>
>
>




