[erlang-questions] term_to_binary and large data structures
Aaron Seigo
aseigo@REDACTED
Thu Jun 28 07:16:48 CEST 2018
On 2018-06-28 01:14, Fred Hebert wrote:
> On 06/27, Aaron Seigo wrote:
>> We have maps with 10k keys that strain this system and easily saturate
>> our network. This is not "big" by any modern definition. As a
>> demonstration of this to ourselves, I wrote an Elixir library that
>> serializes terms to a more space efficient format. Where
>> `term_to_binary` creates 500MB monsters, this library conveniently
>> creates a 1.5MB binary out of the exact same data.
>>
>
> Have you tried comparing when `term_to_binary(Term, [{compressed,
> 9}])'? If you can pack 500MB of data down to 1.5 MB, chances are that
> compression could do some good things on your end.
Yes, and it certainly helps, but the result is still larger than one would
hope for (and larger than what that POC produces). Most importantly, this
is only meaningful when we control the call to `term_to_binary`. When the
call is hidden behind code in OTP or a library, or an equivalent function
is generating an external term format binary, we don't get to use this
trick.
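For reference, the difference is easy to see in the shell. This is a minimal sketch, not the actual data set from the thread; the highly repetitive 10k-key map below just stands in for real-world data with a lot of structural redundancy:

```erlang
%% A stand-in for a large, repetitive map (NOT the thread's actual data).
Term = maps:from_list([{N, lists:duplicate(100, a)} || N <- lists:seq(1, 10000)]),

%% Default encoding vs. zlib-compressed encoding (level 9).
Plain  = term_to_binary(Term),
Packed = term_to_binary(Term, [{compressed, 9}]),

io:format("plain: ~B bytes, compressed: ~B bytes~n",
          [byte_size(Plain), byte_size(Packed)]).
```

Both calls produce valid external term format binaries that `binary_to_term/1` can decode, which is why the option only helps where you own the encoding side.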
This also brings us to the fact that the compression being used is
still zlib, while there are much better options out there. That POC
implementation uses zstd, which is both faster and produces smaller
binaries than zlib.
--
Aaron