[erlang-patches] Pollset per scheduler and bind port to scheduler

Wei Cao cyg.cao@REDACTED
Wed Jul 18 11:09:52 CEST 2012


Thanks, the warnings disappear after disabling hipe. I then ran several
tests with the debug build, fixed several assertion failures that turned
up, and it works all right now.

The fixes are pushed to the pollset_per_scheduler branch at
git://github.com/weicao/otp.git
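
To illustrate the ordering problem behind the erts_no_pollsets assertion
quoted below (the variable was set after the assertion ran), here is a
simplified C sketch. It is not the actual code from erl_check_io.c:
no_pollsets, valid_pollset_ix and init_pollsets are hypothetical names used
only for illustration, and the real initialisation does much more.

```c
#include <assert.h>

/* Hypothetical stand-in for erts_no_pollsets; it is 0 until initialised. */
static int no_pollsets = 0;

/* Mirrors the shape of the failed check:
 * 0 <= (ix) && (ix) < erts_no_pollsets */
static int valid_pollset_ix(int ix)
{
    return 0 <= ix && ix < no_pollsets;
}

/* Hypothetical init routine: the count is stored BEFORE any index is
 * checked. With the assignment placed after the loop instead, the very
 * first assert would fail because no_pollsets would still be 0. */
static void init_pollsets(int count)
{
    int ix;

    no_pollsets = count;
    for (ix = 0; ix < count; ix++) {
        assert(valid_pollset_ix(ix));
        /* per-pollset setup would go here */
    }
}
```

Calling init_pollsets(4) makes indices 0..3 valid; before the call every
index is rejected, which is the state the original assertion ran in.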


2012/7/18 Lukas Larsson <lukas@REDACTED>:
> Hi!
>
> You should disable hipe before trying to compile the debug build. Also, I'm
> unsure whether just adding -DDEBUG to CFLAGS is enough. The way we build a
> debug emulator is:
>
> (cd $ERL_TOP && ./otp_build autoconf && ./otp_build configure --disable-hipe
> && ./otp_build boot)
> (cd $ERL_TOP/erts/emulator && make FLAVOR=smp debug)
>
> and then start it using
>
> $ERL_TOP/bin/cerl -debug
>
> Lukas
>
>
> On 18/07/12 08:23, Wei Cao wrote:
>>
>> 2012/7/18 Wei Cao <cyg.cao@REDACTED>:
>>>
>>> 2012/7/17 Lukas Larsson <lukas@REDACTED>:
>>>>
>>>> Hi!
>>>>
>>>> After fixing the patch to compile on windows[1] I put it in our daily
>>>> builds
>>>
>>> Glad to hear it can compile on Windows now, thanks a lot~ I tried to set
>>> up mingw etc. on a Windows virtual machine yesterday, but it ran very
>>> slowly, and I hit a lot of errors during configuration.
>>>
>>>> and a couple of issues came up. I did not include 'move
>>>> erts_deliver_time
>>>> out of erl_check_io' commit as it deadlocks the non-smp emulator.
>>>
>>> Yeah, I can confirm this problem. There is only one OS thread in the
>>> non-smp emulator, so if check_io starts waiting, other threads have no
>>> opportunity to run, including the thread I added to call
>>> erts_deliver_time() periodically. check_io then has no timer to wake
>>> it up and waits endlessly.
>>>
>>> I'll look for a better solution.
>>>
>>>> When running a debug test build on Linux we got the following assertion:
>>>> Assertion failed: 0 <= (ix) && (ix) < erts_no_pollsets in
>>>> sys/common/erl_check_io.c, line 1909
>>>
>>> Hm... this is a very silly mistake: the variable erts_no_pollsets is
>>> set after the assertion. After fixing it (pushed to the
>>> pollset_per_scheduler branch), I hit another assertion error:
>>>
>>> erlc -W  +debug_info +inline -o../ebin hipe_icode2rtl.erl
>>> Assertion failed: VALID_INSTR(* (Eterm *)(c_p->i)) in beam/beam_emu.c,
>>> line 1241
>>>
>> I tried the latest master branch of erlang/otp (without my patch),
>> configured with
>> ./configure CFLAGS="-DDEBUG -g -O3 -fomit-frame-pointer"
>> and there is the same assertion failure:
>> Assertion failed: VALID_INSTR(* (Eterm *)(c_p->i)) in beam/beam_emu.c,
>> line 1241
>>
>> However, if I instead configure with
>> ./configure CFLAGS="-DDEBUG", the above assertion failure disappears,
>> and warnings like this are reported instead:
>>
>> bit length overflow
>> code 12 bits 7->6
>>
>> It seems CFLAGS influence these assertions.
>>
>>>> On OS X Lion we got the following when compiling gs:
>>>> erl -pa ../ebin -s gs_make -s erlang halt -noshell
>>>> ../include/internal/ethr_mutex.h:655: Fatal error in ethr_mutex_lock():
>>>> Invalid argument (22)
>>>> make[3]: *** [gstk_generic.hrl] Abort trap: 6
>>>> make[2]: *** [opt] Error 2
>>>> make[1]: *** [opt] Error 2
>>>> make: *** [libs] Error 2
>>>
>>> I don't have a Mac to test it on.
>>>
>>>> I'll remove the branch and see if the same problems appear again
>>>> tomorrow
>>>> (unfortunately we do not have enough machines to let your branch run
>>>> alone,
>>>> so it might be some other branch causing this). Let me know if you need
>>>> any
>>>> help tracking down these issues.
>>>>
>>>> Lukas
>>>>
>>>> [1]: https://github.com/garazdawi/otp/tree/wc/pollset_per_scheduler
>>>>
>>>>
>>>> On 11/07/12 17:34, Wei Cao wrote:
>>>>
>>>> In non keep-alive cases, all new connections are accepted by the
>>>> Erlang port which listens on the TCP port, and how frequently and how
>>>> fast that port is scheduled to run limits the QPS (requests per
>>>> second), so this port can be regarded as the bottleneck of non
>>>> keep-alive applications.
>>>>
>>>> So I guess the performance degradation observed is caused by the
>>>> listener port not being scheduled frequently or fast enough. I'll look
>>>> into this problem tomorrow; it is night in China now :-)
>>>>
>>>> BTW, I found today that this patch should be compiled like this:
>>>> ./configure CFLAGS="-DERTS_POLLSET_PER_SCHEDULER -g -O3
>>>> -fomit-frame-pointer"
>>>> otherwise compiler optimization is disabled.
>>>>
>>>> Regarding binding processes/ports to schedulers, I admit that binding a
>>>> port to the same scheduler as its owner process, as the pb patch does,
>>>> is really a temporary solution. I suggest it would be better to add an
>>>> additional BIF, like erlang:process_flag, that lets the user explicitly
>>>> bind a port to a given scheduler when that is beneficial.
>>>>
>>>>
>>>> 2012/7/11 Lukas Larsson <lukas@REDACTED>:
>>>>
>>>> Hi,
>>>>
>>>> The reason I'm skeptical about anything which binds processes/ports to
>>>> schedulers is that it feels like a temporary solution, and I would much
>>>> rather do a proper solution where the scheduler takes care of these
>>>> things for you. But as I said, internally we need to talk this over
>>>> when it is not in the middle of summer vacation.
>>>>
>>>> I did some benchmarking using ab and found basically the same figures as
>>>> you. The below is with keep-alive and the values are requests per
>>>> second:
>>>>
>>>>                  not-bound    bound
>>>> R15B01           44k          37k
>>>> master           44k          35k
>>>> master+mp        48k          49k
>>>> master+mp+pb     49k          55k
>>>>
>>>> [mp]: multi-poll patch
>>>> [pb]: port bind patch
>>>> [bound]: Used {scheduler,I} to spread load
>>>>
>>>> Unfortunately I also found that when doing the non keep-alive benchmark
>>>> the performance is seriously degraded.
>>>>
>>>> R15B01 not-bound          8255
>>>> master+mp+pb not-bound    7668
>>>> master+mp+pb bound        5765
>>>>
>>>> I did some gprof runs but could not find anything obvious that is going
>>>> wrong.
>>>>
>>>> Lukas
>>>>
>>>>
>>>> On 11/07/12 04:21, Wei Cao wrote:
>>>>
>>>> I added a macro to conditionally compile the patch because I thought
>>>> that would make it easier to enable or disable. I can remove the macro,
>>>> fix the compilation error and test on the mingw platform in a later
>>>> version.
>>>>
>>>> How about providing another BIF named port_flag (like process_flag) to
>>>> let the user bind a port to a given scheduler?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Best,
>>>
>>> Wei Cao
>>
>>
>>
>
>



-- 

Best,

Wei Cao


