Performance of term_to_binary vs binary_to_term

Tue Jun 8 10:10:08 CEST 2021

As Jacob suggested, I think the compiler has just optimised out the term_to_binary call.  On my machine, your original benchmark gives:

1> c(tconvert).
{ok,tconvert}
2> tconvert:run(a, 1000000).

term_to_binary/1 RETURN VALUE:<<131,100,0,1,97>>
REQUEST COUNT:1000000
ELAPSED TIME (usec):4707
TIME PER REQUEST (usec): 0.004707
PROJECTED RATE (req/sec): 212449543.23348203

binary_to_term/1 RETURN VALUE:a
REQUEST COUNT:1000000
ELAPSED TIME (usec):142883
TIME PER REQUEST (usec): 0.142883
PROJECTED RATE (req/sec): 6998733.229285499

So that's a factor-of-30 difference, similar to what you reported. I changed the two loops to use the value returned by term_to_binary / binary_to_term:

do_term_to_bin( Term, _, 0 ) -> term_to_binary( Term );
do_term_to_bin( Term, _, N )
->
    X = term_to_binary( Term ),
    do_term_to_bin( Term, X, N-1 )
.

do_bin_to_term( <<Bin/binary>> , _, 0 ) -> binary_to_term( Bin );
do_bin_to_term( <<Bin/binary>> , _, N )
->
    X = binary_to_term( Bin ),
    do_bin_to_term( Bin , X, N-1 )
.

And that resulted in:

3> c(tconvert).             
{ok,tconvert}
4> tconvert:run(a, 1000000).

term_to_binary/1 RETURN VALUE:<<131,100,0,1,97>>
REQUEST COUNT:1000000
ELAPSED TIME (usec):68587
TIME PER REQUEST (usec): 0.068587
PROJECTED RATE (req/sec): 14580022.45323458

binary_to_term/1 RETURN VALUE:a
REQUEST COUNT:1000000
ELAPSED TIME (usec):130298
TIME PER REQUEST (usec): 0.130298
PROJECTED RATE (req/sec): 7674714.884342046

So binary_to_term is about twice as slow as term_to_binary, which is very much in line with what Jacob measured.
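
For anyone who wants to repeat this without the attached file, here is a minimal sketch of a harness along the same lines. The module name tb_bench and its function names are made up for illustration; this is not the tconvert.erl from the thread. It threads each result through the loop so the calls cannot be dropped as unused, and derives the per-request time and projected rate the same way the output above appears to (elapsed usec divided by count, and count * 1000000 divided by elapsed usec).

```erlang
-module(tb_bench).
-export([compare/2, stats/2]).

%% Hypothetical helper module, for illustration only.

%% Time N encodes and N decodes of Term and return the derived figures.
compare(Term, N) when is_integer(N), N > 0 ->
    Bin = term_to_binary(Term),
    %% Matching against the already-bound Bin / Term doubles as a sanity
    %% check that the loops really returned a converted value.
    {EncUsec, Bin}  = timer:tc(fun() -> encode(Term, N, undefined) end),
    {DecUsec, Term} = timer:tc(fun() -> decode(Bin, N, undefined) end),
    #{encode => stats(EncUsec, N),
      decode => stats(DecUsec, N),
      %% max/2 avoids dividing by a 0 usec reading when N is tiny
      ratio  => DecUsec / max(EncUsec, 1)}.

%% Derive time per request (usec) and projected rate (req/sec).
stats(ElapsedUsec, N) ->
    #{elapsed_usec     => ElapsedUsec,
      per_request_usec => ElapsedUsec / N,
      rate_per_sec     => N * 1000000 / max(ElapsedUsec, 1)}.

%% The last result is carried as an argument, so every call's value is used.
encode(_Term, 0, Last) -> Last;
encode(Term, N, _Prev) -> encode(Term, N - 1, term_to_binary(Term)).

decode(_Bin, 0, Last) -> Last;
decode(Bin, N, _Prev) -> decode(Bin, N - 1, binary_to_term(Bin)).
```

Calling tb_bench:compare(a, 1000000) returns a map with the encode figures, the decode figures and their ratio. The absolute numbers will of course depend on the machine and the OTP release.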

> On 8 Jun 2021, at 08:28, Valentin Micic <v@REDACTED> wrote:
> 
> As I was surprised by the measurement myself, I am sure that the compiler did some significant optimisation. I am attaching the file with the source code, so you can review it yourself.
> Also, it would be interesting to see how this performs on R22 (I haven’t installed it yet).
> 
> In my view, it doesn’t really matter how fast the testing code is. What matters here is that there’s an order of magnitude difference in performance between the two BIFs.
> 
> The calling syntax for the tconvert:run/2 is: tconvert:run( a, 10000000 ).
> 
> The first argument is the term to be converted, and the second is the number of iterations; the higher this number, the more accurate the measurement will be (at least in my opinion).
> 
> After reading your email I’ve looked at my code again, and noticed a potential slow-down in the binary_to_term/1 portion of the test.
> 
> do_bin_to_term( <<Bin/binary>> , 0 ) -> binary_to_term( Bin );
> do_bin_to_term( <<Bin/binary>> , N )
> ->
>     binary_to_term( <<Bin/binary>> ),
>     do_bin_to_term( Bin , N-1 )
> .
> 
> 
> When written as 
> 
> do_bin_to_term( <<Bin/binary>> , 0 ) -> binary_to_term( Bin );
> do_bin_to_term( <<Bin/binary>> , N )
> ->
>     binary_to_term( Bin ),
>     do_bin_to_term( Bin , N-1 )
> .
> 
> It speeds up the code by a factor of 2 (well, duh! a cynic would say; so much for compiler optimisation ;-))
> 
> After this “fix”, the binary_to_term/1 portion of the test runs “only” 14 times slower.
> 
> (cig@REDACTED)322> tconvert:run( a, 10000000 ).        
> 
> term_to_binary/1 RETURN VALUE:<<131,100,0,1,97>>
> REQUEST COUNT:10000000
> ELAPSED TIME (usec):94664
> TIME PER REQUEST (usec): 0.0094664
> PROJECTED RATE (req/sec): 105636778.50080284
> 
> binary_to_term/1 RETURN VALUE:a
> REQUEST COUNT:10000000
> ELAPSED TIME (usec):1385235
> TIME PER REQUEST (usec): 0.1385235
> PROJECTED RATE (req/sec): 7218991.723425989
> ok
>  
> 
> Kind regards
> 
> V/
> 
> <tconvert.erl>
> 
> 
> 
> 
>> On 08 Jun 2021, at 07:45, Jacob <jacob01@REDACTED> wrote:
>> 
>> Hi,
>> 
>> I've tried to reproduce the measurement, but according to my
>> measurements, there is just a factor of 2 on Erlang/OTP 22.
>> 
>> 1> timer:tc(fun () -> bench:t2b(a, 1000000) end).
>> {109357,<<131,100,0,1,97>>}
>> 2> timer:tc(fun () -> bench:b2t(<<131,100,0,1,97>>, 1000000) end).
>> {199488,a}
>> 
>> 
>> If I do not use the result of each term_to_binary call, the factor (~14)
>> is much closer to your measurements:
>> 
>> 3> timer:tc(fun () -> bench:broken_t2b(a, 1000000) end).
>> {14404,<<>>}
>> 
>> Are you sure that the compiler did not optimise away the entire
>> call?
>> 
>> /Jacob
>> 
>> ======================== bench.erl ==============================
>> -module(bench).
>> 
>> -export([t2b/2, b2t/2, broken_t2b/2]).
>> 
>> 
>> t2b(T, N) -> t2b(T, N, undefined).
>> 
>> t2b(_, 0, R) -> R;
>> t2b(T, N, _) -> R = term_to_binary(T), t2b(T, N-1, R).
>> 
>> b2t(T, N) -> b2t(T, N, undefined).
>> 
>> b2t(_, 0, R) -> R;
>> b2t(T, N, _) -> R = binary_to_term(T), b2t(T, N-1, R).
>> 
>> broken_t2b(T, N) -> broken_t2b(T, N, undefined).
>> 
>> broken_t2b(_, 0, R) -> R;
>> broken_t2b(T, N, R) -> _ = term_to_binary(T), broken_t2b(T, N-1, R).
>> =================================================================
>> 
>> 
>> On 06.06.21 02:07, Valentin Micic wrote:
>>> Hi all,
>>> 
>>> I did some performance measurement recently that included conversion of
>>> an arbitrary erlang term to its external binary representation via
>>> term_to_binary/1, as well as reversing the result using binary_to_term/1.
>>> 
>>> I’ve noticed that term_to_binary/1 is significantly faster than
>>> binary_to_term/1.
>>> 
>>> Also, I’ve observed that binary_to_term/1 performance gets considerably
>>> worse as the complexity of the specified term increases, whilst
>>> term_to_binary/1 maintains (more or less) steady performance.
>>> 
>>> (cig@REDACTED)40> tconvert:run( a, 10000000 ).
>>> 
>>> term_to_binary/1 RETURN VALUE:<<131,100,0,1,97>>
>>> REQUEST COUNT:10000000
>>> ELAPSED TIME (usec):97070
>>> TIME PER REQUEST (usec): 0.009707
>>> PROJECTED RATE (req/sec): *103018440*.30081384
>>> 
>>> binary_to_term/1 RETURN VALUE:a
>>> REQUEST COUNT:10000000
>>> ELAPSED TIME (usec):3383483
>>> TIME PER REQUEST (usec): 0.3383483
>>> PROJECTED RATE (req/sec): *2955534*.2822765773
>>> ok
>>> 
>>> (cig@REDACTED)41> tconvert:run( {a,<<1,2,3>>, b, [1,2,3], c, {1,2,3},
>>> d, #{a=>1, b=>2, c=>3}}, 10000000 ).
>>> 
>>> term_to_binary/1 RETURN VALUE:<<131,104,8,100,0,1,97,109,0,0,0,3,1,2,3,100,0,1,
>>>                                 98,107,0,3,1,2,3,100,0,1,99,104,3,97,1,97,2,97,
>>>                                 3,100,0,1,100,116,0,0,0,3,100,0,1,97,97,1,100,
>>>                                 0,1,98,97,2,100,0,1,99,97,3>>
>>> REQUEST COUNT:10000000
>>> ELAPSED TIME (usec):97307
>>> TIME PER REQUEST (usec): 0.0097307
>>> PROJECTED RATE (req/sec): *102767529*.57135664
>>> 
>>> binary_to_term/1 RETURN VALUE:{a,<<1,2,3>>,
>>>                                  b,
>>>                                  [1,2,3],
>>>                                  c,
>>>                                  {1,2,3},
>>>                                  d, 
>>>                                  #{a => 1,b => 2,c => 3}}
>>> REQUEST COUNT:10000000
>>> ELAPSED TIME (usec):8747426
>>> TIME PER REQUEST (usec): 0.8747426
>>> PROJECTED RATE (req/sec): *1143193*.4377038456
>>> ok
>>> 
>>> 
>>> 
>>> I’ve performed testing on R21.1.
>>> Any thoughts?
>>> 
>>> V/
>> 
> 
