[erlang-questions] performance drop in R15A?
Gustav Simonsson
gustav.simonsson@REDACTED
Thu Apr 12 23:42:05 CEST 2012
Hi Nico,
Have you tried running the benchmarks between git bisects,
using R14B04 and R15A as endpoints?
// Gustav
Sent from my PC
----- Original Message -----
> From: "Nico Kruber" <lists.nico.k@REDACTED>
> To: erlang-questions@REDACTED
> Cc: "Lukas Larsson" <lukas@REDACTED>
> Sent: Thursday, 12 April, 2012 2:47:50 PM
> Subject: Re: [erlang-questions] performance drop in R15A?
>
> is there any news on this?
> Our internal test server has a new additional configuration which
> compiles the
> latest otp from github once per day and tests its performance with
> the
> mentioned benchmarks.
> The resulting graphs for R14B04, R15B and latest git checkout are
> attached. As
> you can see - nothing has changed since then so I suppose it is still
> present
> in R15B01 :(
>
> I don't know how to debug this any further. If there is anything I
> could do to
> help you identify the issue, please tell me.
>
>
> Regards
> Nico Kruber
>
> On Wednesday 14 December 2011 17:41:33 Nico Kruber wrote:
> > ok, I did some profiling and made some optimisations but the
> > differences
> > between R15B (git checkout from master just now) and R14B04 are
> > still the
> > same:
> >
> > First the results of the calls to bench:quorum_read(10, 50000)
> > (best of
> > three) ------------------------------------------------------------
> >
> > | R14B04 | R15B | time surplus
> >
> > ------------------------------------------------------------
> > timer:tc, smp8:8 | 22484619| 26641737| +18%
> > timer:tc, smp4:4 | 34009785| 35963489| + 6%
> > timer:tc, smp2:2 | 32863428| 34102038| + 4%
> > timer:tc, smpoff | 53805419| 54214795| + 1%
> > ------------------------------------------------------------
> >
> > It seems there is some lock-contention or some other problem with
> > smp in our
> > case which hasn't been there before.
> > I also observed that the results using R15B vary quite much,
> > especially with
> > smp8:8 (see below for an example). R14B04 is more stable among
> > different
> > test runs.
> > But still, even without smp, R15B is slower than R14B04 :(
> >
> > Lukas suggested to run fprof to a bit deeper and maybe find the
> > function
> > taking longer in R15 than in R14B04.
> > Although it was helpful in finding some hot spots, it did however
> > not help
> > with this problem. Following are the totals reported by fprof
> > (if you like, I can send the complete logs - didn't want to send
> > too much to
> > the mailing list though).
> >
> > ------------------------------------------------------------
> >
> > | R14B04 | R15B | time surplus
> >
> > ------------------------------------------------------------
> > fprof, CNT, smp8:8 | 7723796| 7722635| n/a
> > fprof, ACC, smp8:8 | 98239| 85477| -13%
> > fprof, OWN, smp8:8 | 300534| 264018| -12%
> > ------------------------------------------------------------
> >
> > As you can see, according to fprof, R15B should be faster (but it
> > isn't!).
> > I assume that some function calls are not reported by fprof
> > (e.g. I did not see erlang:length/1).
> > Is there some way to include them again in order to analyse
> > further?
> > Maybe suspend or garbage_collect are not in ACC or OWN?
> > But even here, R15B wastes less time as you can see in the table
> > below.
> > ----------------------------------------------------------
> >
> > | R14B04 | R15B |
> >
> > ----------------------------------------------------------
> > fprof, suspend, smp8:8 | 9.8421e+06| 8.51855e+06|
> > fprof, garbage_collect, smp8:8 | 1315| 794|
> > ----------------------------------------------------------
> >
> > We do rely quite much on the process dictionary (erlang:put and
> > erlang:get)
> > which you might see as an "non-typical usage pattern" for erlang.
> > However, if I switch to ets instead, performance of
> > bench:quorum_read(10,
> > 50000) for R15B does not change and for R14B04 decreases by about
> > 7%.
> >
> > Everything else I would consider "typical use" - or at least I'm
> > not aware
> > of anything untypical.
> >
> > I hope, someone can help to further analyse the problem. I did
> > everything I
> > could think of and are stuck now.
> > Is other real-world software also affected? E.g. testing couchdb
> > would be
> > nice.
> >
> >
> > Regards
> > Nico Kruber
> >
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > how to re-produce the values
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> > I'm using scalaris from http://code.google.com/p/scalaris/
> > ./configure && make
> > ================ R15B ===============
> > --------------------------------
> > ./bin/firstnode.sh
> > Erlang R15B (erts-5.9) [source] [64-bit] [smp:8:8]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {33764932,ok} % 1st try
> > {26641737,ok} % 2nd try
> > {27575975,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S4:4"
> > Erlang R15B (erts-5.9) [source] [64-bit] [smp:4:4]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {35963489,ok} % 1st try
> > {36332156,ok} % 2nd try
> > {37295819,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S2:2"
> > Erlang R15B (erts-5.9) [source] [64-bit] [smp:2:2]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {34102038,ok} % 1st try
> > {34379052,ok} % 2nd try
> > {34416173,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "-smp disable"
> > Erlang R15B (erts-5.9) [source] [64-bit] [async-threads:0] [hipe]
> > [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {54214795,ok} % 1st try
> > {55210947,ok} % 2nd try
> > {54537194,ok} % 3rd try
> >
> > ================ R14B04 ===============
> > --------------------------------
> > ./bin/firstnode.sh
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8]
> > [async-threads:0] [hipe] [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {22598819,ok} % 1st try
> > {22706185,ok} % 2nd try
> > {22484619,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S4:4"
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:4:4] [rq:4]
> > [async-threads:0] [hipe] [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {35482368,ok} % 1st try
> > {34009785,ok} % 2nd try
> > {34527203,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S2:2"
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2]
> > [async-threads:0] [hipe] [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {33590356,ok} % 1st try
> > {33192033,ok} % 2nd try
> > {32863428,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "-smp disable"
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [rq:1]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> >
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {53805419,ok} % 1st try
> > {54054538,ok} % 2nd try
> > {54066521,ok} % 3rd try
> >
> >
> > fprof runs:
> > ./bin/firstnode.sh
> >
> > fprof:trace([start, {procs, processes()}]).
> > bench:quorum_read(10, 1000).
> > fprof:trace(stop).
> > fprof:profile().
> > fprof:analyse([{dest, "test.R15B.fprof.log"}]). % or R14B04
> > respectively
> >
> > > cat test.R14B04.fprof.log |grep suspend|cut -d',' -f3 | awk
> > > '{s+=$1} END
> > > {print s}'
> > 9.8421e+06
> >
> > > cat test.R15B.fprof.log |grep suspend|cut -d',' -f3 | awk
> > > '{s+=$1} END
> > > {print s}'
> > 8.51855e+06
> >
> > > cat test.R14B04.fprof.log |grep garbage_collect|cut -d',' -f3 |
> > > awk
> > > '{s+=$1} END {print s}'
> > 1315.33
> >
> > > cat test.R15B.fprof.log |grep garbage_collect|cut -d',' -f3 | awk
> > > '{s+=$1}
> > > END {print s}'
> > 794.956
> >
> > On Friday 09 December 2011 12:09:56 Kenneth Lundin wrote:
> > > On 12/5/11, Nico Kruber <kruber@REDACTED> wrote:
> > > > Is there any news on this?
> > > >
> > > > The performance drop of the recent R15 checkout compared to
> > > > R14B04 in
> > > > our
> > > > application is now at _30%_ compared to a drop of "only" 15% in
> > > > R15A!
> > >
> > > When working with R15B we are of course performing a number of
> > > bencmarks to keep track of the performance.
> > >
> > > Whe always strive to improve general performance in each release
> > > AND
> > > OUR BENCHMARKS INDICATE THAT R15B IS FASTER IN GENERAL THAN
> > > R14B04.
> > >
> > > It is however always possible that some specific operations or
> > > combinations of operations get slower while others are optimized.
> > >
> > > Regarding the findings for this particular application we think
> > > the
> > > reason for slowdown is a very frequent but non typical usage
> > > pattern
> > > that now becomes visible as a performance bottleneck.
> > >
> > >
> > > / Kenneth, Erlang/OTP Ericsson
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
More information about the erlang-questions
mailing list