[erlang-questions] performance drop in R15A?

Gustav Simonsson gustav.simonsson@REDACTED
Thu Apr 12 23:42:05 CEST 2012


Hi Nico,

Have you tried running the benchmarks between git bisects,
using R14B04 and R15A as endpoints?

// Gustav

Sent from my PC

----- Original Message -----
> From: "Nico Kruber" <lists.nico.k@REDACTED>
> To: erlang-questions@REDACTED
> Cc: "Lukas Larsson" <lukas@REDACTED>
> Sent: Thursday, 12 April, 2012 2:47:50 PM
> Subject: Re: [erlang-questions] performance drop in R15A?
> 
> is there any news on this?
> Our internal test server has a new additional configuration which
> compiles the
> latest otp from github once per day and tests its performance with
> the
> mentioned benchmarks.
> The resulting graphs for R14B04, R15B and latest git checkout are
> attached. As
> you can see - nothing has changed since then so I suppose it is still
> present
> in R15B01 :(
> 
> I don't know how to debug this any further. If there is anything I
> could do to
> help you identify the issue, please tell me.
> 
> 
> Regards
> Nico Kruber
> 
> On Wednesday 14 December 2011 17:41:33 Nico Kruber wrote:
> > ok, I did some profiling and made some optimisations but the
> > differences
> > between R15B (git checkout from master just now) and R14B04 are
> > still the
> > same:
> > 
> > First the results of the calls to bench:quorum_read(10, 50000)
> > (best of
> > three) ------------------------------------------------------------
> > 
> >                    |   R14B04   |   R15B     | time surplus
> > 
> > ------------------------------------------------------------
> > timer:tc, smp8:8   |    22484619|    26641737| +18%
> > timer:tc, smp4:4   |    34009785|    35963489| + 6%
> > timer:tc, smp2:2   |    32863428|    34102038| + 4%
> > timer:tc, smpoff   |    53805419|    54214795| + 1%
> > ------------------------------------------------------------
> > 
> > It seems there is some lock-contention or some other problem with
> > smp in our
> > case which hasn't been there before.
> > I also observed that the results using R15B vary quite much,
> > especially with
> > smp8:8 (see below for an example). R14B04 is more stable among
> > different
> > test runs.
> > But still, even without smp, R15B is slower than R14B04 :(
> > 
> > Lukas suggested to run fprof to a bit deeper and maybe find the
> > function
> > taking longer in R15 than in R14B04.
> > Although it was helpful in finding some hot spots, it did however
> > not help
> > with this problem. Following are the totals reported by fprof
> > (if you like, I can send the complete logs - didn't want to send
> > too much to
> > the mailing list though).
> > 
> > ------------------------------------------------------------
> > 
> >                    |   R14B04   |   R15B     | time surplus
> > 
> > ------------------------------------------------------------
> > fprof, CNT, smp8:8 |     7723796|     7722635| n/a
> > fprof, ACC, smp8:8 |       98239|       85477| -13%
> > fprof, OWN, smp8:8 |      300534|      264018| -12%
> > ------------------------------------------------------------
> > 
> > As you can see, according to fprof, R15B should be faster (but it
> > isn't!).
> > I assume that some function calls are not reported by fprof
> > (e.g. I did not see erlang:length/1).
> > Is there some way to include them again in order to analyse
> > further?
> > Maybe suspend or garbage_collect are not in ACC or OWN?
> > But even here, R15B wastes less time as you can see in the table
> > below.
> > ----------------------------------------------------------
> > 
> >                                |   R14B04   |   R15B     |
> > 
> > ----------------------------------------------------------
> > fprof, suspend, smp8:8         |  9.8421e+06| 8.51855e+06|
> > fprof, garbage_collect, smp8:8 |        1315|         794|
> > ----------------------------------------------------------
> > 
> > We do rely quite much on the process dictionary (erlang:put and
> > erlang:get)
> > which you might see as an "non-typical usage pattern" for erlang.
> > However, if I switch to ets instead, performance of
> > bench:quorum_read(10,
> > 50000) for R15B does not change and for R14B04 decreases by about
> > 7%.
> > 
> > Everything else I would consider "typical use" - or at least I'm
> > not aware
> > of anything untypical.
> > 
> > I hope, someone can help to further analyse the problem. I did
> > everything I
> > could think of and are stuck now.
> > Is other real-world software also affected? E.g. testing couchdb
> > would be
> > nice.
> > 
> > 
> > Regards
> > Nico Kruber
> > 
> > 
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > how to re-produce the values
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > 
> > I'm using scalaris from http://code.google.com/p/scalaris/
> > ./configure && make
> > ================ R15B ===============
> > --------------------------------
> > ./bin/firstnode.sh
> > Erlang R15B (erts-5.9) [source] [64-bit] [smp:8:8]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {33764932,ok} % 1st try
> > {26641737,ok} % 2nd try
> > {27575975,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S4:4"
> > Erlang R15B (erts-5.9) [source] [64-bit] [smp:4:4]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {35963489,ok} % 1st try
> > {36332156,ok} % 2nd try
> > {37295819,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S2:2"
> > Erlang R15B (erts-5.9) [source] [64-bit] [smp:2:2]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {34102038,ok} % 1st try
> > {34379052,ok} % 2nd try
> > {34416173,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "-smp disable"
> > Erlang R15B (erts-5.9) [source] [64-bit] [async-threads:0] [hipe]
> > [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {54214795,ok} % 1st try
> > {55210947,ok} % 2nd try
> > {54537194,ok} % 3rd try
> > 
> > ================ R14B04 ===============
> > --------------------------------
> > ./bin/firstnode.sh
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8]
> > [async-threads:0] [hipe] [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {22598819,ok} % 1st try
> > {22706185,ok} % 2nd try
> > {22484619,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S4:4"
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:4:4] [rq:4]
> > [async-threads:0] [hipe] [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {35482368,ok} % 1st try
> > {34009785,ok} % 2nd try
> > {34527203,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "+S2:2"
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2]
> > [async-threads:0] [hipe] [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {33590356,ok} % 1st try
> > {33192033,ok} % 2nd try
> > {32863428,ok} % 3rd try
> > --------------------------------
> > ./bin/firstnode.sh "-smp disable"
> > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [rq:1]
> > [async-threads:0] [hipe]
> > [kernel-poll:false]
> > 
> > timer:tc(bench, quorum_read, [10, 50000]).
> > {53805419,ok} % 1st try
> > {54054538,ok} % 2nd try
> > {54066521,ok} % 3rd try
> > 
> > 
> > fprof runs:
> > ./bin/firstnode.sh
> > 
> > fprof:trace([start, {procs, processes()}]).
> > bench:quorum_read(10, 1000).
> > fprof:trace(stop).
> > fprof:profile().
> > fprof:analyse([{dest, "test.R15B.fprof.log"}]). % or R14B04
> > respectively
> > 
> > > cat test.R14B04.fprof.log |grep suspend|cut -d',' -f3 | awk
> > > '{s+=$1} END
> > > {print s}'
> > 9.8421e+06
> > 
> > > cat test.R15B.fprof.log |grep suspend|cut -d',' -f3 | awk
> > > '{s+=$1} END
> > > {print s}'
> > 8.51855e+06
> > 
> > > cat test.R14B04.fprof.log |grep garbage_collect|cut -d',' -f3 |
> > > awk
> > > '{s+=$1} END {print s}'
> > 1315.33
> > 
> > > cat test.R15B.fprof.log |grep garbage_collect|cut -d',' -f3 | awk
> > > '{s+=$1}
> > > END {print s}'
> > 794.956
> > 
> > On Friday 09 December 2011 12:09:56 Kenneth Lundin wrote:
> > > On 12/5/11, Nico Kruber <kruber@REDACTED> wrote:
> > > > Is there any news on this?
> > > > 
> > > > The performance drop of the recent R15 checkout compared to
> > > > R14B04 in
> > > > our
> > > > application is now at _30%_ compared to a drop of "only" 15% in
> > > > R15A!
> > > 
> > > When working with R15B we are of course performing a number of
> > > bencmarks to keep track of the performance.
> > > 
> > > Whe always strive to improve general performance in each release
> > > AND
> > > OUR BENCHMARKS INDICATE THAT R15B IS FASTER IN GENERAL THAN
> > > R14B04.
> > > 
> > > It is however always possible that some specific operations or
> > > combinations of operations get slower while others are optimized.
> > > 
> > > Regarding the findings for this particular application we think
> > > the
> > > reason for slowdown is a very frequent but non typical usage
> > > pattern
> > > that now becomes visible as a performance bottleneck.
> > > 
> > > 
> > > / Kenneth, Erlang/OTP Ericsson
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
> 



More information about the erlang-questions mailing list