persistent_term to replace ETS for caching

Mon Dec 7 00:54:30 CET 2020

On Sun, Dec 6, 2020 at 3:30 PM Richard O'Keefe <raoknz@REDACTED> wrote:

> Thanks for that advice about persistent_term.
> From the documentation,
> <quote>
> When a persistent term is updated or deleted, a global garbage collection
> pass is run to scan all processes for the deleted term, and to copy it into
> each process that still uses it.
> </quote>
> Do I understand correctly that if three processes refer
> to a persistent-term, and one of them deletes it, it
> will be copied into both of the other processes, thus
> INCREASING the amount of memory used?
>

Correct.

> This is sufficiently counter-intuitive that I am not
> sure I would dare to use this feature.
>
>
This feature is intended to optimize storage of data that is written
extremely seldom but read very frequently. Borderline write-once. Reads
become very cheap since no synchronisation between threads is needed at all
once the term is in place, and no copying of the data is needed when
passing the value between processes. This comes at the cost of expensive
writes (of a new key/value) and *very* expensive modifications. This is
something that one has to take into account when utilizing it.
persistent_term is *not* intended to replace any of the other term storages
available. If you think of it as an ETS replacement, it may perhaps seem
counter-intuitive, but this is *not* what it is intended for. When modifying
persistent terms one needs to be very careful not to cause problems for the
system. I think the documentation is quite clear on that as well.
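
A minimal sketch of the write-once, read-often pattern described above. The
module name pt_config, the key, and the function names are hypothetical and
chosen only for illustration; the persistent_term calls are put/2 and get/2.

```erlang
-module(pt_config).
-export([init/1, fetch/0]).

%% Hypothetical key. Namespacing it with the module name avoids clashes
%% with keys stored by other applications.
-define(KEY, {?MODULE, config}).

%% Intended to be called once, for example at application start.
%% Storing a new key is comparatively expensive, so it should not be
%% done on a hot path.
init(Config) when is_map(Config) ->
    persistent_term:put(?KEY, Config).

%% Can be called from any process. The stored term is not copied onto
%% the heap of the calling process. Returns an empty map if init/1 has
%% not been called yet.
fetch() ->
    persistent_term:get(?KEY, #{}).
```

For example, pt_config:init(#{pool_size => 10}) returns ok, and any process
can afterwards call pt_config:fetch() to read the map.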

Storage of persistent terms utilizes the same functionality as storage of
literals in code. When purging a module, you may see the same effect as for
persistent terms. If a lot of processes refer to literals from the module
being purged, the purge may increase the amount of memory being used, since
the literals of the module will be copied onto the heaps of the processes
still referring to the literals of the module.

In both cases the user is more or less expected to stop using the data once
it has been removed from the storage. If this is true, you will not see an
increase in memory usage due to these features.

> Do I further understand correctly that *adding* a new
> persistent-term does NOT force garbage collection?
>
>
Correct. Garbage collection is only needed when we need to remove a term
that might be referred to by one or more processes.
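
As a rough illustration of which operations fall into which category, the
sketch below walks through put/2 and erase/1 on a single key. The module name
pt_gc_demo and the key are hypothetical. The comments restate the behaviour
described above and in the persistent_term documentation; the code itself
does not measure garbage collection.

```erlang
-module(pt_gc_demo).
-export([run/0]).

run() ->
    Key = {?MODULE, demo},  % hypothetical key
    %% Adding a new key: no term has to be removed, so no global scan
    %% (assuming the key was not already stored).
    ok = persistent_term:put(Key, {v, 1}),
    %% Storing an equal value: documented to return quickly without
    %% replacing anything.
    ok = persistent_term:put(Key, {v, 1}),
    %% Replacing the value: the old term is removed, so all processes
    %% are scanned for references to it.
    ok = persistent_term:put(Key, {v, 2}),
    %% Deleting the key: the stored term is removed, same scan as above.
    true = persistent_term:erase(Key),
    %% Nothing is stored under the key any more, so erase/1 returns false.
    false = persistent_term:erase(Key),
    ok.
```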

Regards,
Rickard

> On Mon, 7 Dec 2020 at 00:49, Nalin Ranjan <ranjanified@REDACTED> wrote:
>
>> It has the potential to trigger Global GC, and can affect responsiveness
>> as per the docs.
>>
>> https://erlang.org/doc/man/persistent_term.html
>>
>> Regards
>> Nalin Ranjan
>>
>> On Sun, Dec 6, 2020 at 5:16 AM Frank Muller <frank.muller.erl@REDACTED>
>> wrote:
>>
>>> Hi guys,
>>>
>>> At work, we cache about 5.3 million entries in ETS. The system works
>>> perfectly, no issue so far (many years).
>>>
>>> During a brainstorming session, a colleague suggested to switch to
>>> persistent_term instead to avoid ETS term copying.
>>>
>>> Pretty simple: we check if the Key exists in persistent_term. If yes, we
>>> are done. If not, we get it from ETS, move it to persistent_term and send
>>> it back to the caller.
>>>
>>> Question: is there any limitation(s) on persistent_term usage? Stated
>>> otherwise, can we create 5.3 million persistent_term <K,V>?
>>>
>>
>>>
>>> Any suggestion/idea/thought is very welcome.
>>>
>>> /Frank
>>>
>>

-- 
Rickard Green, Erlang/OTP, Ericsson AB