Performance of term_to_binary vs binary_to_term

Lukas Larsson lukas@REDACTED
Tue Jun 8 16:21:34 CEST 2021


Hello!

On Tue, Jun 8, 2021 at 2:57 PM Richard O'Keefe <raoknz@REDACTED> wrote:

> Why would decoding a term create *any* garbage in typical cases?
> One source of garbage in my Smalltalk library is that floats are
> represented as an integer power of two scale modifying an integer
> (which might be a bignum), so the second integer (if large) is
> garbage.  But Erlang doesn't do that.  It represents a float as
> 8 binary bytes.  The reason is that my Smalltalk had to deal with
> double extended, which could be 64, 80, 96, or 128 bits, so the
> external representation had to deal with it, but Erlang supports
> 64-bit IEEE doubles only.
>
> Erlang's external format follows the ASN Type-Length-Value
> principle (more or less), so that when binary_to_term/1 reads
> something, it knows exactly what to allocate and how big.
>
> What am I missing here?
>

The garbage I was referring to is the term itself. The term "garbage" may
not have been the best choice to describe that data.

In the initial question the benchmark was done on `{a,<<1,2,3>>, b,
[1,2,3], c, {1,2,3}, d, #{a=>1, b=>2, c=>3}}`, which would create 35 words of
heap data when decoded.

However, when encoded it is represented by:
`<<131,104,8,100,0,1,97,109,0,0,0,3,1,2,3,100,0,1,98,107,0,
  3,1,2,3,100,0,1,99,104,3,97,1,97,2,97,3,100,0,1,100,116,
  0,0,0,3,100,0,1,97,97,1,100,0,1,98,97,2,100,0,1,99,97,3>>`
which is only 10 words of heap data.

So each loop in the decode benchmark would generate 3.5 times as much
garbage for the garbage collector to deal with.
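
To check such numbers on a given system, something like the following can
be pasted into the Erlang shell. It is only a sketch: erts_debug:flat_size/1
is an undocumented debugging helper, the variable names are just for
illustration, and the results depend on the OTP release and the word size,
so they may not match the figures quoted above exactly.

```erlang
%% Sketch for the Erlang shell. Exact numbers depend on the OTP release
%% and on whether the emulator uses 32-bit or 64-bit words.
Term = {a, <<1,2,3>>, b, [1,2,3], c, {1,2,3}, d, #{a => 1, b => 2, c => 3}},
Bin = term_to_binary(Term),

%% Heap words occupied by the decoded term
%% (erts_debug:flat_size/1 is an undocumented debug helper).
DecodedWords = erts_debug:flat_size(Term),

%% Heap words occupied by the encoded binary itself.
EncodedWords = erts_debug:flat_size(Bin),

io:format("decoded: ~p words, encoded: ~p bytes (~p words)~n",
          [DecodedWords, byte_size(Bin), EncodedWords]).
```

The ratio DecodedWords / EncodedWords is what each iteration of the decode
benchmark leaves behind for the collector, relative to the encode benchmark.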

Lukas