Performance of term_to_binary vs binary_to_term
Lukas Larsson
lukas@REDACTED
Tue Jun 8 16:21:34 CEST 2021
Hello!
On Tue, Jun 8, 2021 at 2:57 PM Richard O'Keefe <raoknz@REDACTED> wrote:
> Why would decoding a term create *any* garbage in typical cases?
> One source of garbage in my Smalltalk library is that floats are
> represented as an integer power of two scale modifying an integer
> (which might be a bignum), so the second integer (if large) is
> garbage. But Erlang doesn't do that. It represents a float as
> 8 binary bytes. The reason is that my Smalltalk had to deal with
> double extended, which could be 64, 80, 96, or 128 bits, so the
> external representation had to deal with it, but Erlang supports
> 64-bit IEEE doubles only.
>
> Erlang's external format follows the ASN Type-Length-Value
> principle (more or less), so that when binary_to_term/1 reads
> something, it knows exactly what to allocate and how big.
>
> What am I missing here?
>
The garbage I was referring to is the term itself. The term "garbage" may
not have been the best choice to describe that data.
In the initial question the benchmark was done on `{a,<<1,2,3>>, b,
[1,2,3], c, {1,2,3}, d, #{a=>1, b=>2, c=>3}}`, which would create 35 words
of heap data when decoded.
However, when encoded it is represented by:
`<<131,104,8,100,0,1,97,109,0,0,0,3,1,2,3,100,0,1,98,107,0,
3,1,2,3,100,0,1,99,104,3,97,1,97,2,97,3,100,0,1,100,116,
0,0,0,3,100,0,1,97,97,1,100,0,1,98,97,2,100,0,1,99,97,3>>`
which is only 10 words of heap data.
So each iteration of the decode benchmark would generate 3.5 times as much
garbage for the garbage collector to deal with as each iteration of the
encode benchmark.
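A rough way to check these numbers yourself is sketched below. It assumes
`erts_debug:flat_size/1`, which is an internal, unsupported helper, so treat
the results as indicative only. The module name `term_size` and the function
`compare/0` are made up for this illustration. The exact word counts depend
on the OTP release and the word size, and newer releases encode atoms
differently, so the encoded binary may not be byte-for-byte the one shown
above.

```erlang
-module(term_size).
-export([compare/0]).

%% Hypothetical helper: compares the heap footprint of the encoded
%% binary with that of the term produced by decoding it.
compare() ->
    Term = {a, <<1,2,3>>, b, [1,2,3], c, {1,2,3}, d,
            #{a => 1, b => 2, c => 3}},
    Bin = term_to_binary(Term),
    %% Decode so that we measure a freshly built term, not the literal.
    Decoded = binary_to_term(Bin),
    #{encoded_bytes => byte_size(Bin),
      %% Heap words for the binary itself. Only meaningful while it is
      %% small enough to be a heap binary; for an off-heap binary this
      %% counts just the on-heap reference.
      encoded_words => erts_debug:flat_size(Bin),
      %% Heap words for the decoded term, ignoring any sharing.
      decoded_words => erts_debug:flat_size(Decoded)}.
```

Calling `term_size:compare()` in a shell returns a map with the three
measurements, and dividing `decoded_words` by `encoded_words` gives the
ratio discussed above. `flat_size/1` reports the size of the finished term,
which may be less than what `binary_to_term/1` actually allocates while
decoding.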
Lukas