[erlang-questions] performance drop in R15A?

Nico Kruber lists.nico.k@REDACTED
Tue Apr 17 19:11:10 CEST 2012


I'll try to set up a script to automatically let git bisect with different otp 
versions as soon as I have time for that.
However, chances are probably slim that this has been introduced by a single 
commit, especially since the runtime options "+sbt db +swt low" reduce the 
difference between R14B04 and R15B a lot...

Nico

On Thursday 12 April 2012 22:42:05 Gustav Simonsson wrote:
> Hi Nico,
> 
> Have you tried running the benchmarks between git bisects,
> using R14B04 and R15A as endpoints?
> 
> // Gustav
> 
> Sent from my PC
> 
> ----- Original Message -----
> 
> > From: "Nico Kruber" <lists.nico.k@REDACTED>
> > To: erlang-questions@REDACTED
> > Cc: "Lukas Larsson" <lukas@REDACTED>
> > Sent: Thursday, 12 April, 2012 2:47:50 PM
> > Subject: Re: [erlang-questions] performance drop in R15A?
> > 
> > is there any news on this?
> > Our internal test server has a new additional configuration which
> > compiles the
> > latest otp from github once per day and tests its performance with
> > the
> > mentioned benchmarks.
> > The resulting graphs for R14B04, R15B and latest git checkout are
> > attached. As
> > you can see - nothing has changed since then so I suppose it is still
> > present
> > in R15B01 :(
> > 
> > I don't know how to debug this any further. If there is anything I
> > could do to
> > help you identify the issue, please tell me.
> > 
> > 
> > Regards
> > Nico Kruber
> > 
> > On Wednesday 14 December 2011 17:41:33 Nico Kruber wrote:
> > > ok, I did some profiling and made some optimisations but the
> > > differences
> > > between R15B (git checkout from master just now) and R14B04 are
> > > still the
> > > same:
> > > 
> > > First the results of the calls to bench:quorum_read(10, 50000)
> > > (best of
> > > three) ------------------------------------------------------------
> > > 
> > >                    |   R14B04   |   R15B     | time surplus
> > > 
> > > ------------------------------------------------------------
> > > timer:tc, smp8:8   |    22484619|    26641737| +18%
> > > timer:tc, smp4:4   |    34009785|    35963489| + 6%
> > > timer:tc, smp2:2   |    32863428|    34102038| + 4%
> > > timer:tc, smpoff   |    53805419|    54214795| + 1%
> > > ------------------------------------------------------------
> > > 
> > > It seems there is some lock-contention or some other problem with
> > > smp in our
> > > case which hasn't been there before.
> > > I also observed that the results using R15B vary quite much,
> > > especially with
> > > smp8:8 (see below for an example). R14B04 is more stable among
> > > different
> > > test runs.
> > > But still, even without smp, R15B is slower than R14B04 :(
> > > 
> > > Lukas suggested to run fprof to a bit deeper and maybe find the
> > > function
> > > taking longer in R15 than in R14B04.
> > > Although it was helpful in finding some hot spots, it did however
> > > not help
> > > with this problem. Following are the totals reported by fprof
> > > (if you like, I can send the complete logs - didn't want to send
> > > too much to
> > > the mailing list though).
> > > 
> > > ------------------------------------------------------------
> > > 
> > >                    |   R14B04   |   R15B     | time surplus
> > > 
> > > ------------------------------------------------------------
> > > fprof, CNT, smp8:8 |     7723796|     7722635| n/a
> > > fprof, ACC, smp8:8 |       98239|       85477| -13%
> > > fprof, OWN, smp8:8 |      300534|      264018| -12%
> > > ------------------------------------------------------------
> > > 
> > > As you can see, according to fprof, R15B should be faster (but it
> > > isn't!).
> > > I assume that some function calls are not reported by fprof
> > > (e.g. I did not see erlang:length/1).
> > > Is there some way to include them again in order to analyse
> > > further?
> > > Maybe suspend or garbage_collect are not in ACC or OWN?
> > > But even here, R15B wastes less time as you can see in the table
> > > below.
> > > ----------------------------------------------------------
> > > 
> > >                                |   R14B04   |   R15B     |
> > > 
> > > ----------------------------------------------------------
> > > fprof, suspend, smp8:8         |  9.8421e+06| 8.51855e+06|
> > > fprof, garbage_collect, smp8:8 |        1315|         794|
> > > ----------------------------------------------------------
> > > 
> > > We do rely quite much on the process dictionary (erlang:put and
> > > erlang:get)
> > > which you might see as an "non-typical usage pattern" for erlang.
> > > However, if I switch to ets instead, performance of
> > > bench:quorum_read(10,
> > > 50000) for R15B does not change and for R14B04 decreases by about
> > > 7%.
> > > 
> > > Everything else I would consider "typical use" - or at least I'm
> > > not aware
> > > of anything untypical.
> > > 
> > > I hope, someone can help to further analyse the problem. I did
> > > everything I
> > > could think of and are stuck now.
> > > Is other real-world software also affected? E.g. testing couchdb
> > > would be
> > > nice.
> > > 
> > > 
> > > Regards
> > > Nico Kruber
> > > 
> > > 
> > > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > > how to re-produce the values
> > > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > > 
> > > I'm using scalaris from http://code.google.com/p/scalaris/
> > > ./configure && make
> > > ================ R15B ===============
> > > --------------------------------
> > > ./bin/firstnode.sh
> > > Erlang R15B (erts-5.9) [source] [64-bit] [smp:8:8]
> > > [async-threads:0] [hipe]
> > > [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {33764932,ok} % 1st try
> > > {26641737,ok} % 2nd try
> > > {27575975,ok} % 3rd try
> > > --------------------------------
> > > ./bin/firstnode.sh "+S4:4"
> > > Erlang R15B (erts-5.9) [source] [64-bit] [smp:4:4]
> > > [async-threads:0] [hipe]
> > > [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {35963489,ok} % 1st try
> > > {36332156,ok} % 2nd try
> > > {37295819,ok} % 3rd try
> > > --------------------------------
> > > ./bin/firstnode.sh "+S2:2"
> > > Erlang R15B (erts-5.9) [source] [64-bit] [smp:2:2]
> > > [async-threads:0] [hipe]
> > > [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {34102038,ok} % 1st try
> > > {34379052,ok} % 2nd try
> > > {34416173,ok} % 3rd try
> > > --------------------------------
> > > ./bin/firstnode.sh "-smp disable"
> > > Erlang R15B (erts-5.9) [source] [64-bit] [async-threads:0] [hipe]
> > > [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {54214795,ok} % 1st try
> > > {55210947,ok} % 2nd try
> > > {54537194,ok} % 3rd try
> > > 
> > > ================ R14B04 ===============
> > > --------------------------------
> > > ./bin/firstnode.sh
> > > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8]
> > > [async-threads:0] [hipe] [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {22598819,ok} % 1st try
> > > {22706185,ok} % 2nd try
> > > {22484619,ok} % 3rd try
> > > --------------------------------
> > > ./bin/firstnode.sh "+S4:4"
> > > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:4:4] [rq:4]
> > > [async-threads:0] [hipe] [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {35482368,ok} % 1st try
> > > {34009785,ok} % 2nd try
> > > {34527203,ok} % 3rd try
> > > --------------------------------
> > > ./bin/firstnode.sh "+S2:2"
> > > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2]
> > > [async-threads:0] [hipe] [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {33590356,ok} % 1st try
> > > {33192033,ok} % 2nd try
> > > {32863428,ok} % 3rd try
> > > --------------------------------
> > > ./bin/firstnode.sh "-smp disable"
> > > Erlang R14B04 (erts-5.8.5) [source] [64-bit] [rq:1]
> > > [async-threads:0] [hipe]
> > > [kernel-poll:false]
> > > 
> > > timer:tc(bench, quorum_read, [10, 50000]).
> > > {53805419,ok} % 1st try
> > > {54054538,ok} % 2nd try
> > > {54066521,ok} % 3rd try
> > > 
> > > 
> > > fprof runs:
> > > ./bin/firstnode.sh
> > > 
> > > fprof:trace([start, {procs, processes()}]).
> > > bench:quorum_read(10, 1000).
> > > fprof:trace(stop).
> > > fprof:profile().
> > > fprof:analyse([{dest, "test.R15B.fprof.log"}]). % or R14B04
> > > respectively
> > > 
> > > > cat test.R14B04.fprof.log |grep suspend|cut -d',' -f3 | awk
> > > > '{s+=$1} END
> > > > {print s}'
> > > 
> > > 9.8421e+06
> > > 
> > > > cat test.R15B.fprof.log |grep suspend|cut -d',' -f3 | awk
> > > > '{s+=$1} END
> > > > {print s}'
> > > 
> > > 8.51855e+06
> > > 
> > > > cat test.R14B04.fprof.log |grep garbage_collect|cut -d',' -f3 |
> > > > awk
> > > > '{s+=$1} END {print s}'
> > > 
> > > 1315.33
> > > 
> > > > cat test.R15B.fprof.log |grep garbage_collect|cut -d',' -f3 | awk
> > > > '{s+=$1}
> > > > END {print s}'
> > > 
> > > 794.956
> > > 
> > > On Friday 09 December 2011 12:09:56 Kenneth Lundin wrote:
> > > > On 12/5/11, Nico Kruber <kruber@REDACTED> wrote:
> > > > > Is there any news on this?
> > > > > 
> > > > > The performance drop of the recent R15 checkout compared to
> > > > > R14B04 in
> > > > > our
> > > > > application is now at _30%_ compared to a drop of "only" 15% in
> > > > > R15A!
> > > > 
> > > > When working with R15B we are of course performing a number of
> > > > bencmarks to keep track of the performance.
> > > > 
> > > > Whe always strive to improve general performance in each release
> > > > AND
> > > > OUR BENCHMARKS INDICATE THAT R15B IS FASTER IN GENERAL THAN
> > > > R14B04.
> > > > 
> > > > It is however always possible that some specific operations or
> > > > combinations of operations get slower while others are optimized.
> > > > 
> > > > Regarding the findings for this particular application we think
> > > > the
> > > > reason for slowdown is a very frequent but non typical usage
> > > > pattern
> > > > that now becomes visible as a performance bottleneck.
> > > > 
> > > > 
> > > > / Kenneth, Erlang/OTP Ericsson
> > 
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions



More information about the erlang-questions mailing list