[erlang-questions] io_lib:format R16B03 performance and jsx impact

Lukas Larsson lukas@REDACTED
Tue Jan 28 17:50:21 CET 2014


Hello,

The performance loss seems to be unrelated to whether it is integers or
floats. What seems to make the difference is the size of the data created.
The textual size of 50 floats is about 3 times larger than that of the
integers you use in the benchmark. If you change it so that
erts_debug:size(Floats) is the same for both the floop and the iloop (I
changed the iloop seq from 50 to 162), you see the same drop in speed
between R15B03 and R16B.
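
For reference, a rough shell sketch of such a size comparison (not the
fjsx code itself); erts_debug:size/1 returns a term's heap size in machine
words, and the integer seq bound is adjusted until the two sizes match:

    FloatText = [hd(io_lib:format("~p", [N * 1.0])) || N <- lists:seq(1, 50)],
    IntText   = [integer_to_list(N) || N <- lists:seq(1, 162)],
    {erts_debug:size(FloatText), erts_debug:size(IntText)}.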

So most probably the performance decrease has something to do with either
changes in memory allocation or garbage collection. I don't really know
what it could be and don't have the time right now to look into it. If you
want to help figure out what it is, doing a git bisect between R15B03 and
R16B to find the exact commit that introduced the performance loss would
be a great help.

As a side note, using float_to_list(Float) instead of
hd(io_lib:format("~p",[Float])) in jsx_to_json.erl more than tripled the
number of floats encoded per second (10kps vs 35kps), and using
float_to_list(Float,[{decimals,4},compact]) doubled that again, giving a
total of 7.6 times greater performance (10kps vs 76kps).
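
For concreteness, a quick shell sketch of the three per-float formatting
calls (not the jsx integration itself; outputs shown as comments, and
float_to_list/2 is only available from R16B):

    F = 1.0 / 3,
    hd(io_lib:format("~p", [F])),               %% "0.3333333333333333"
    float_to_list(F),                           %% "3.33333333333333314830e-01"
    float_to_list(F, [{decimals, 4}, compact]). %% "0.3333"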

Lukas


On Tue, Jan 28, 2014 at 4:57 PM, Dmitry Kolesnikov
<dmkolesnikov@REDACTED> wrote:

> Hello,
>
> Here I've compiled a small project to benchmark the issue:
>
> git clone https://github.com/fogfish/fjsx
> make
> make run
> (fjsx@REDACTED)1> fjsx:run().
>
> My results are following:
> R15B03: min 2.9K, avg 3.1K, max 3.3K
> R16B03: min 2.7K, avg 2.8K, max 3.0K
>
> (In production I do much more stuff; it shows even worse degradation.)
>
> I ran the test on a virtual machine, CentOS 6 x86_64 with 4 virtual CPUs
> (underlying HW: MacBook Pro i5, 2.5GHz).
> Virtual CPU
> vendor_id : GenuineIntel
> cpu family : 6
> model : 58
> model name : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
> stepping : 9
> cpu MHz : 2535.252
> cache size : 6144 KB
> physical id : 0
> siblings : 4
> core id : 3
> cpu cores : 4
> apicid : 3
> initial apicid : 3
> fpu : yes
> fpu_exception : yes
> cpuid level : 5
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat
> pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc
> rep_good pni ssse3 lahf_lm
> bogomips : 5070.50
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
>
> OTP configuration is identical for R15 and R16
>
> R15B03: config.log
>   $ ./configure --prefix=/usr/local/otp_R15B03 --enable-threads
> --enable-smp-support --enable-kernel-poll --enable-hipe
> --disable-dynamic-ssl-lib --with-ssl=/usr/local/ssl --enable-native-libs
>
> R16B03: config.log
>   $ ./configure --prefix=/usr/local/otp_R16B03 --enable-threads
> --enable-smp-support --enable-kernel-poll --enable-hipe
> --disable-dynamic-ssl-lib --with-ssl=/usr/local/ssl --enable-native-libs
>
> I have not run the test on real HW:
>  - my Mac's OTP configurations are different: R16B03
> enables --enable-darwin-64bit, therefore it outperforms R15
>  - my production runs on virtual machines
>
> I've been using eep to profile the issue. You can do the same to compare
> R16 and R15 differences.
>
> Best Regards,
> Dmitry
>
> On 28 Jan 2014, at 11:54, Lukas Larsson <lukas@REDACTED> wrote:
>
> Hello,
>
> The code for formatting floats through io_lib:format is written in pure
> Erlang. The reason io_lib:format is implemented in Erlang is that it
> allows much greater cross-platform formatting capabilities, albeit at the
> cost of performance.
>
> Why you see a performance drop between R15B03 and R16B03 I don't know;
> if you could create a minimal reproducible benchmark that shows the
> difference, that would be great.
>
> If you want a speedy conversion of something you know is a float to a
> textual format, you should use float_to_list/float_to_binary, as those
> are meant to be fast conversions, but with less flexibility.
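>
> A minimal sketch of that fast path (output shapes as comments;
> float_to_binary/1,2 and float_to_list/2 were added in R16B):
>
>     F = 1.5,
>     float_to_list(F),                   %% "1.50000000000000000000e+00"
>     float_to_binary(F),                 %% <<"1.50000000000000000000e+00">>
>     float_to_list(F, [{decimals, 2}]).  %% "1.50"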
>
> Lukas
>
>
> On Tue, Jan 28, 2014 at 10:05 AM, Max Lapshin <max.lapshin@REDACTED> wrote:
>
>> btw, why is io_lib:format so slow? Simply changing it to a NIF with
>> fprintf reduces CPU a lot.
>>