OTP22 performance on lists

Thu Jul 2 10:33:59 CEST 2020

Hi Dániel,

Thanks for checking!
I can definitely tell it's not that OTP21 bug. In all my test environments
(whether on my machine or on the servers), this returns *1* every time:
erlang:convert_time_unit(1000, native, microsecond).
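For anyone who wants to run the same check on both OTP versions, here is a minimal sketch that wraps the one-liner above. The module and function names (clock_check, native_resolution) are made up for illustration; only erlang:convert_time_unit/3 comes from this thread.

```erlang
-module(clock_check).
-export([native_resolution/0]).

%% Hypothetical helper: reports how many native time units make up one
%% second, next to the conversion used in this thread.
native_resolution() ->
    %% Number of native units per second (1000000000 means nanoseconds).
    UnitsPerSecond = erlang:convert_time_unit(1, second, native),
    %% The check from the thread: 1000 native units expressed in microseconds.
    ThreadCheck = erlang:convert_time_unit(1000, native, microsecond),
    #{units_per_second => UnitsPerSecond, thread_check => ThreadCheck}.
```

Calling clock_check:native_resolution() on each node and comparing the two maps should show whether both installations use the same native time unit.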

You got your results with smaller samples… So, I tried with smaller
samples, too… And you're right! OTP22 seems faster with those.
For instance,

test:bench({test, rec}, 2500, 500, 25).
% reports avg times over 100µs in OTP21, but under 70µs in OTP22…

…but if I increase the size of the individual lists to concat…

test:bench({test, rec}, 2500, 500, 200).
% reports ~550µs in OTP21 / ~1300µs in OTP22

…and if I increase the # of lists…

test:bench({test, rec}, 2500, 5000, 25).
% reports ~600µs in OTP21 and it *jumps between* 700µs and 1400µs
% in OTP22 *(O_O)!!*

This is OTP22… and it happened multiple times…
72> [test:bench({test, rec}, 2500, 5000, 25) || _ <- lists:seq(1,10)].
[1296.7716,*700.2968*,1317.4632,*698.1416*,1316.9648,*696.0352*,
 1294.4548,*706.8484*,1304.8368,*697.4452*]
How is that *shaking* even possible?
Aaanyway… if I only run the code 250 times (instead of 2500)… OTP22 is
consistently at *~1800µs*.
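To look at a run of results like the list above as a whole, a small summary helper can be handy. This is only a sketch: the module name bench_stats and the function summary/1 are hypothetical and are not part of the gist.

```erlang
-module(bench_stats).
-export([summary/1]).

%% Hypothetical helper: summarizes a non-empty list of timings.
summary(Samples) when Samples =/= [] ->
    Sorted = lists:sort(Samples),
    N = length(Sorted),
    Median =
        case N rem 2 of
            %% Odd count: take the middle element.
            1 -> lists:nth(N div 2 + 1, Sorted);
            %% Even count: average the two middle elements.
            0 -> (lists:nth(N div 2, Sorted) + lists:nth(N div 2 + 1, Sorted)) / 2
        end,
    #{min => hd(Sorted),
      max => lists:last(Sorted),
      median => Median,
      avg => lists:sum(Sorted) / N}.
```

When the samples split into two clusters, as they do above, the average and the median both land between the clusters, so min and max say more about what is going on.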

So, extrapolating wildly from very incomplete data… the problem seems to
present itself when applying ++ to either larger lists or to a large # of
lists (which, in turn, results in ++'ing larger lists too… although AFAICT
the time it takes to run ++ depends mostly on the length of the first
list, not the second one).
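The point about the first list can be checked with a rough sketch like the one below. It assumes nothing beyond timer:tc/1 and lists:seq/2; the module name append_cost and the function compare/2 are made up for illustration.

```erlang
-module(append_cost).
-export([compare/2]).

%% Hypothetical helper: times Long ++ Short against Short ++ Long and
%% returns both timings in microseconds. A single run is noisy, so this
%% is only an indication, not a benchmark.
compare(LongLen, ShortLen) ->
    Long = lists:seq(1, LongLen),
    Short = lists:seq(1, ShortLen),
    %% Left operand is the long list: ++ has to copy all of it.
    {LongFirst, _} = timer:tc(fun() -> Long ++ Short end),
    %% Left operand is the short list: only the short list is copied.
    {ShortFirst, _} = timer:tc(fun() -> Short ++ Long end),
    #{long_first_us => LongFirst, short_first_us => ShortFirst}.
```

Something like append_cost:compare(1000000, 10) would be one way to try it. Since ++ copies its left operand and reuses the right one, long_first_us would be expected to come out larger, although a single measurement can be skewed by garbage collection or scheduling.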

I'm still puzzled…

On Thu, Jul 2, 2020 at 1:43 AM Dániel Szoboszlay <dszoboszlay@REDACTED>
wrote:

> Hi Brujo,
>
> I tried to reproduce your results, and for me OTP 22 seems to be
> consistently faster for rec and about the same for lrec:
>
> Test case     Min         Max        Median     Average
> rec  / OTP 21 124.170us   305.990us  127.120us  128.688us
> rec  / OTP 22  62.570us   289.190us   67.610us   68.976us
> lrec / OTP 21  16.110us    22.570us   17.790us   17.881us
> lrec / OTP 22  15.390us    22.710us   17.170us   17.441us
>
> (These are times for a single invocation of the functions for much smaller
> inputs than in your example.)
>
> However, there are some OTP 21 versions where due to a configuration bug
> the resolution of the native clock was much lower. For example,
> erlang:convert_time_unit(1000, native, microsecond). normally gives 1 on
> my machine (so the native clock has nanosecond resolution), and on the
> buggy OTP 21 versions I get 610. On these OTP 21 versions I see that your
> rec benchmark seems to run ~6% faster than with OTP 22, but this is
> measurement error due to the lower clock resolution.
>
> Could you please check you have the same clock resolution on both OTP
> versions?
>
> Cheers,
> Daniel
>
> On Wed, 1 Jul 2020 at 12:56, Fernando Benavides <elbrujohalcon@REDACTED>
> wrote:
>
>> (Sorry if you receive this email twice, I sent it from another account
>> and it seems like it didn't reach the mailing list then)
>>
>> Hi erlangers,
>>
>> Yeah! It's me talking about performance, you read that subject right
>> Anyway… At NextRoll we're in the process of migrating our systems from
>> OTP21 to OTP22 (not 23, yet) and our tests showed a *huge* impact on
>> performance in general, that was not associated with anything in particular.
>>
>> While trying to figure out what was causing it, we came up with a very
>> very basic example of things that are consistently slower in OTP22.
>>
>> I created this gist to show it:
>> https://gist.github.com/elbrujohalcon/d4e995fbc4b93fadddfd1f0d6b9f8121
>>
>> I'm aware that these kinds of micro-benchmarks are treacherous and they
>> may vary wildly depending on context. Nevertheless, both on one of our
>> servers in AWS and on my machine (a MacOS Pro running Catalina 10.15.5), and
>> using kerl to install multiple versions of OTP 21 and 22… every single time
>> I ran the tests I found the same results…
>>
>> Always starting the nodes with *erl -boot start_clean* then running the
>> following in both OTP21 and OTP22…
>>
>> c(test), test:bench({test, lrec}, 250, 5000, 2000).
>>
>> …generates very similar numbers in both versions, regardless of the
>> numbers used for the different parameters. But…
>>
>> c(test), test:bench({test, rec}, 250, 5000, 2000).
>>
>> …consistently generates larger results in OTP22 than OTP21. I tried with
>> different values for the number of tests, the number of lists and their
>> length and sometimes the difference is more evident, sometimes less… but
>> OTP22 times are *always* larger.
>>
>> I found this in the OTP22 readme…
>>
>>   OTP-15427    Application(s): erts
>>                Appending lists (The ++ operator) will now yield
>>                properly on large inputs.
>>
>> So… questions…
>> 1. Has anybody experienced (and hopefully solved) this problem before
>> when migrating to OTP22?
>> 2. Do you think OTP-15427 can be related to what I'm seeing?
>> 3. Can someone confirm if you also experience the same difference in
>> performance when running the same benchmarks that I pasted on that gist?
>> 4. Is there anything else I should try/test/use to check?
>> 5. Am I going slightly mad?
>>
>> Thanks in advance, cheers :)
>>
>>
>> --
>>
>> Brujo Benavides
>> about.me/elbrujohalcon
>>
>

-- 
Brujo Benavides
about.me/elbrujohalcon