[erlang-questions] performance drop in R15A?

Nico Kruber lists.nico.k@REDACTED
Thu Apr 12 14:47:50 CEST 2012


is there any news on this?
Our internal test server has a new additional configuration which compiles the 
latest otp from github once per day and tests its performance with the 
mentioned benchmarks.
The resulting graphs for R14B04, R15B and latest git checkout are attached. As 
you can see - nothing has changed since then so I suppose it is still present 
in R15B01 :(

I don't know how to debug this any further. If there is anything I could do to 
help you identify the issue, please tell me.


Regards
Nico Kruber

On Wednesday 14 December 2011 17:41:33 Nico Kruber wrote:
> ok, I did some profiling and made some optimisations but the differences
> between R15B (git checkout from master just now) and R14B04 are still the
> same:
> 
> First the results of the calls to bench:quorum_read(10, 50000) (best of
> three) ------------------------------------------------------------
> 
>                    |   R14B04   |   R15B     | time surplus
> 
> ------------------------------------------------------------
> timer:tc, smp8:8   |    22484619|    26641737| +18%
> timer:tc, smp4:4   |    34009785|    35963489| + 6%
> timer:tc, smp2:2   |    32863428|    34102038| + 4%
> timer:tc, smpoff   |    53805419|    54214795| + 1%
> ------------------------------------------------------------
> 
> It seems there is some lock-contention or some other problem with smp in our
> case which hasn't been there before.
> I also observed that the results using R15B vary quite much, especially with
> smp8:8 (see below for an example). R14B04 is more stable among different
> test runs.
> But still, even without smp, R15B is slower than R14B04 :(
> 
> Lukas suggested to run fprof to a bit deeper and maybe find the function
> taking longer in R15 than in R14B04.
> Although it was helpful in finding some hot spots, it did however not help
> with this problem. Following are the totals reported by fprof
> (if you like, I can send the complete logs - didn't want to send too much to
> the mailing list though).
> 
> ------------------------------------------------------------
> 
>                    |   R14B04   |   R15B     | time surplus
> 
> ------------------------------------------------------------
> fprof, CNT, smp8:8 |     7723796|     7722635| n/a
> fprof, ACC, smp8:8 |       98239|       85477| -13%
> fprof, OWN, smp8:8 |      300534|      264018| -12%
> ------------------------------------------------------------
> 
> As you can see, according to fprof, R15B should be faster (but it isn't!).
> I assume that some function calls are not reported by fprof
> (e.g. I did not see erlang:length/1).
> Is there some way to include them again in order to analyse further?
> Maybe suspend or garbage_collect are not in ACC or OWN?
> But even here, R15B wastes less time as you can see in the table below.
> ----------------------------------------------------------
> 
>                                |   R14B04   |   R15B     |
> 
> ----------------------------------------------------------
> fprof, suspend, smp8:8         |  9.8421e+06| 8.51855e+06|
> fprof, garbage_collect, smp8:8 |        1315|         794|
> ----------------------------------------------------------
> 
> We do rely quite much on the process dictionary (erlang:put and erlang:get)
> which you might see as an "non-typical usage pattern" for erlang.
> However, if I switch to ets instead, performance of bench:quorum_read(10,
> 50000) for R15B does not change and for R14B04 decreases by about 7%.
> 
> Everything else I would consider "typical use" - or at least I'm not aware
> of anything untypical.
> 
> I hope, someone can help to further analyse the problem. I did everything I
> could think of and are stuck now.
> Is other real-world software also affected? E.g. testing couchdb would be
> nice.
> 
> 
> Regards
> Nico Kruber
> 
> 
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> how to re-produce the values
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> 
> I'm using scalaris from http://code.google.com/p/scalaris/
> ./configure && make
> ================ R15B ===============
> --------------------------------
> ./bin/firstnode.sh
> Erlang R15B (erts-5.9) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe]
> [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {33764932,ok} % 1st try
> {26641737,ok} % 2nd try
> {27575975,ok} % 3rd try
> --------------------------------
> ./bin/firstnode.sh "+S4:4"
> Erlang R15B (erts-5.9) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe]
> [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {35963489,ok} % 1st try
> {36332156,ok} % 2nd try
> {37295819,ok} % 3rd try
> --------------------------------
> ./bin/firstnode.sh "+S2:2"
> Erlang R15B (erts-5.9) [source] [64-bit] [smp:2:2] [async-threads:0] [hipe]
> [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {34102038,ok} % 1st try
> {34379052,ok} % 2nd try
> {34416173,ok} % 3rd try
> --------------------------------
> ./bin/firstnode.sh "-smp disable"
> Erlang R15B (erts-5.9) [source] [64-bit] [async-threads:0] [hipe]
> [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {54214795,ok} % 1st try
> {55210947,ok} % 2nd try
> {54537194,ok} % 3rd try
> 
> ================ R14B04 ===============
> --------------------------------
> ./bin/firstnode.sh
> Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8]
> [async-threads:0] [hipe] [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {22598819,ok} % 1st try
> {22706185,ok} % 2nd try
> {22484619,ok} % 3rd try
> --------------------------------
> ./bin/firstnode.sh "+S4:4"
> Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:4:4] [rq:4]
> [async-threads:0] [hipe] [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {35482368,ok} % 1st try
> {34009785,ok} % 2nd try
> {34527203,ok} % 3rd try
> --------------------------------
> ./bin/firstnode.sh "+S2:2"
> Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2]
> [async-threads:0] [hipe] [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {33590356,ok} % 1st try
> {33192033,ok} % 2nd try
> {32863428,ok} % 3rd try
> --------------------------------
> ./bin/firstnode.sh "-smp disable"
> Erlang R14B04 (erts-5.8.5) [source] [64-bit] [rq:1] [async-threads:0] [hipe]
> [kernel-poll:false]
> 
> timer:tc(bench, quorum_read, [10, 50000]).
> {53805419,ok} % 1st try
> {54054538,ok} % 2nd try
> {54066521,ok} % 3rd try
> 
> 
> fprof runs:
> ./bin/firstnode.sh
> 
> fprof:trace([start, {procs, processes()}]).
> bench:quorum_read(10, 1000).
> fprof:trace(stop).
> fprof:profile().
> fprof:analyse([{dest, "test.R15B.fprof.log"}]). % or R14B04 respectively
> 
> > cat test.R14B04.fprof.log |grep suspend|cut -d',' -f3 | awk '{s+=$1} END
> > {print s}'
> 9.8421e+06
> 
> > cat test.R15B.fprof.log |grep suspend|cut -d',' -f3 | awk '{s+=$1} END
> > {print s}'
> 8.51855e+06
> 
> > cat test.R14B04.fprof.log |grep garbage_collect|cut -d',' -f3 | awk
> > '{s+=$1} END {print s}'
> 1315.33
> 
> > cat test.R15B.fprof.log |grep garbage_collect|cut -d',' -f3 | awk '{s+=$1}
> > END {print s}'
> 794.956
> 
> On Friday 09 December 2011 12:09:56 Kenneth Lundin wrote:
> > On 12/5/11, Nico Kruber <kruber@REDACTED> wrote:
> > > Is there any news on this?
> > > 
> > > The performance drop of the recent R15 checkout compared to R14B04 in
> > > our
> > > application is now at _30%_ compared to a drop of "only" 15% in R15A!
> > 
> > When working with R15B we are of course performing a number of
> > bencmarks to keep track of the performance.
> > 
> > Whe always strive to improve general performance in each release AND
> > OUR BENCHMARKS INDICATE THAT R15B IS FASTER IN GENERAL THAN R14B04.
> > 
> > It is however always possible that some specific operations or
> > combinations of operations get slower while others are optimized.
> > 
> > Regarding the findings for this particular application we think the
> > reason for slowdown is a very frequent but non typical usage pattern
> > that now becomes visible as a performance bottleneck.
> > 
> > 
> > / Kenneth, Erlang/OTP Ericsson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: erlang-R15-performance.jpg
Type: image/jpeg
Size: 58032 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20120412/6ba80afd/attachment.jpg>


More information about the erlang-questions mailing list