[erlang-questions] ETS memory fragmentation after deleting data

Sverker Eriksson sverker@REDACTED
Thu Feb 7 22:25:10 CET 2019


Hi Dániel,
I looked at your test code, and I think what is missing is the 'mbcs_pool' stats.
For some reason erlang:system_info({allocator_sizes,ets_alloc}) returns them as
{mbcs_pool,[{blocks_size,0}]}, without carriers_size.
Use erlang:system_info({allocator,ets_alloc}) to get mbcs_pool with both block and carrier sizes.
Another thing that might be confusing is that all binaries larger than 64 bytes are stored in binary_alloc.
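
A minimal sketch of how the two system_info calls could be compared from an Erlang shell. It assumes each call returns a list of {instance, N, Info} tuples where Info is a property list; the exact contents of each entry differ between OTP releases, so the sketch only extracts the mbcs_pool entries and does not interpret them. The names PoolOf, Full and Summary are illustrative only.

```erlang
%% Extract the mbcs_pool entry of every ets_alloc instance for a given
%% system_info key (allocator or allocator_sizes).
%% Assumption: the result is [{instance, N, Info}] with Info a proplist.
%% The call may also return false if the allocator is disabled, so any
%% non-list result is mapped to an empty list.
PoolOf = fun(Key) ->
    case erlang:system_info({Key, ets_alloc}) of
        Instances when is_list(Instances) ->
            [{N, proplists:get_value(mbcs_pool, Info)}
             || {instance, N, Info} <- Instances];
        _Other ->
            []
    end
end.

%% Full view: expected to carry both block and carrier sizes for the pool.
Full = PoolOf(allocator).

%% Summary view: the one that appeared to lack carriers_size.
Summary = PoolOf(allocator_sizes).

io:format("allocator:       ~p~nallocator_sizes: ~p~n", [Full, Summary]).
```

Printing both views side by side should make it visible whether the pool's carrier sizes are present in one and absent from the other on a given release.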
/Sverker

On Thu, 2019-02-07 at 15:35 +0100, Dániel Szoboszlay wrote:
> Hi,
> 
> I would like to understand some things about ETS memory fragmentation after
> deleting data. My current (probably faulty) mental model of the issue looks
> like this:
> 1. For every object in an ETS table a block is allocated on a carrier
>    (typically a multi-block carrier, unless the object is huge).
> 2. Besides the objects themselves, the ETS table obviously needs some
>    additional blocks too to describe the hash table data structure. The size
>    of this data shall be small compared to the object data however (since
>    ETS is not terribly space-inefficient), so I won't think about them any
>    more.
> 3. If I delete some objects from an ETS table, the corresponding blocks are
>    deallocated. However, the rest of the objects remain in their original
>    location, so the carriers cannot be deallocated (unless all of their
>    objects get deleted).
> 4. This implies that deleting a lot of data from ETS tables would lead to
>    memory fragmentation.
> 5. Since there's no way to force ETS to rearrange the objects it already
>    stores, the memory remains fragmented until subsequent updates to ETS
>    tables fill the gaps with new objects.
>
> I wrote a small test program (available here) to verify my mental model. But
> it doesn't exactly behave as I expected.
> 1. I create an ETS table and populate it with 1M objects, where each object
>    is 1027 words large.
>
>    I expect the total ETS memory use to be around 1M * 1027 * 8 bytes ~ 7835
>    MiB (the size of all other ETS tables on a newly started Erlang node is
>    negligible).
>
>    And indeed I see that the total block size is ~7881 MiB and the total
>    carrier size is ~7885 MiB (99.95% utilisation).
>
> 2. I delete 75% of the objects randomly.
>
>    I expect the block size to go down by ~75% and the carrier size by some
>    smaller value.
>
>    In practice however the block size goes down by 87%, while the carrier
>    size drops by 48% (resulting in a disappointing 25% utilisation).
>
> 3. Finally, I try to defragment the memory by overwriting each object that
>    was left in the table with itself.
>
>    I expect this operation to have no effect on the block size, but close the
>    gap between the block size and carrier size by compacting the blocks on
>    fewer carriers.
>
>    In practice however the block size goes up by 91%(!!!), while the carrier
>    size comes down very close to this new block size (utilisation is back at
>    99.56%).
>
> All in all, compared to the initial state in step 1, both the block size and
> the carrier size are down by 75%.
>
> So here's the list of things I don't understand or know based on this
> exercise:
> 1. How could the block size drop by 87% after deleting 75% of the data in
>    step 2?
> 2. Why did overwriting each object with itself result in almost doubling the
>    block size?
> 3. Would you consider running a select_replace to compact a table after
>    deletions safe in production? E.g. doing it on a Mnesia table that's
>    several GBs in size and is actively used by Mnesia transactions. (I know
>    the replace is atomic on each object, but how would a long running replace
>    affect the execution time of other operations for example?)
> 4. Step 3 helped to reclaim unused memory, but it almost doubled the used
>    memory (the block size). I don't know what caused this behaviour, but is
>    there an operation that would achieve the opposite effect? That is, reduce
>    the block size by 45-50% without altering the contents of the table?
>
> Thanks,
> Daniel