[erlang-questions] ets vs list

Morten Krogh <>
Mon Sep 13 17:16:17 CEST 2010


  Ulf and Robert,
thanks for your answers. It makes sense that delete calls free in the 
malloc sense.
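
One way to check this from the shell is to watch the table's own memory counter around an insert and a delete. This is only a minimal sketch: the module name, table name and payload size are made up for illustration, and it assumes that ets:info(T, memory) (the number of words allocated to the table) is a fair proxy for what has been freed.

```erlang
-module(ets_delete_demo).
-export([run/0]).

%% Returns the table's memory footprint (in words) when empty,
%% after one insert, and after deleting that key again.
run() ->
    T = ets:new(demo, [set]),
    Empty = ets:info(T, memory),
    %% A largish term so the change in size is easy to see.
    true = ets:insert(T, {key, lists:seq(1, 10000)}),
    Full = ets:info(T, memory),
    %% Delete the object; its storage is released right away,
    %% without waiting for any garbage collector.
    true = ets:delete(T, key),
    AfterDelete = ets:info(T, memory),
    %% Drop the whole table when done.
    true = ets:delete(T),
    {Empty, Full, AfterDelete}.
```

If delete frees the object immediately, the third number should fall back below the second as soon as ets:delete/2 returns.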

What I don't get now is what the data in ets that is not garbage
collected actually is.
Does the ets process have data beyond the table itself?
Are we talking about intermediate values used in the match function etc?

Morten.




On 9/13/10 4:58 PM, Robert Virding wrote:
> All data in an ets table will be garbage collected! It is perfectly
> safe to insert and delete data as often as you need.
>
> What Ulf meant was that an ets table is not garbage collected in the
> same way as a normal process, and that you don't run into the same
> problems with overly long garbage collection times as you can with a
> process that has a *LOT* of local data. This is why it is safe to have
> large ets tables and why you should be careful about keeping *LARGE*
> databases as local data in a process. This is a problem when using
> dict, gb_trees, array and lists.
>
> The downside of this is that data has to be copied from an ets table
> into the process heap before it can be used, which is why we have
> ets:match and ets:select, which allow you to perform more tests on
> potential data before copying them into the process heap. Not needing
> this copy is a benefit of using dict, gb_trees, array and lists.
>
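
To make the copying point concrete, here is a small sketch that fetches the same keys in two ways: once with ets:select, where the guard is evaluated inside ets and only the matching keys are copied out, and once by copying the whole table with ets:tab2list and filtering on the process heap. The module name, table contents and threshold are invented for the example.

```erlang
-module(ets_select_demo).
-export([run/0]).

run() ->
    T = ets:new(demo, [set]),
    true = ets:insert(T, [{N, N * N} || N <- lists:seq(1, 1000)]),
    %% Match spec: for objects {K, V} with V > 990000, return K.
    %% The guard runs inside ets, so only matching keys are copied
    %% into the calling process.
    Spec = [{{'$1', '$2'}, [{'>', '$2', 990000}], ['$1']}],
    Selected = lists:sort(ets:select(T, Spec)),
    %% Same result, but every object is first copied to the
    %% process heap and filtered there.
    Filtered = lists:sort([K || {K, V} <- ets:tab2list(T), V > 990000]),
    true = ets:delete(T),
    {Selected, Filtered}.
```

Both lists come out the same; the difference is in how much data crosses from the table into the process before the test is applied.
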
> So they are all *safe* but perform differently.
>
> Robert
>
>
> On 13 September 2010 16:41, Morten Krogh<>  wrote:
>>   What exactly does it mean that data residing in ets isn't garbage
>> collected?
>>
>> Does it mean that if a {key, Value} pair is in the ets table T,
>> and I write
>>
>> ets:delete(T, Key)
>>
>> then Value will not be garbage collected?
>>
>> Morten.
>>
>> On 9/13/10 4:29 PM, Ulf Wiger wrote:
>>> On 09/13/2010 02:58 PM, Max Lapshin wrote:
>>>> 2) what ets is for, if list is fast enough on inserts?
>>> The main unique feature of ets is that data residing in
>>> ets is not garbage-collected.
>>>
>>> Roughly speaking, the cost of GC is proportional to the
>>> amount of live data, so if you have very large data sets,
>>> process heap-based data will have an increasing hidden
>>> cost in GC sweep/copy time, even though access times may
>>> look excellent in benchmarks.
>>>
>>> Running benchmarks for longer periods should allow you
>>> to include amortized GC cost in the results, but you
>>> will also get a bunch of other noise. Tracing on GC
>>> events with timestamps may be a more accurate method.
>>>
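
A rough sketch of the GC tracing idea, using erlang:trace/3 with the garbage_collection and timestamp flags. The module name and the worker's workload are made up for illustration. The tags in the trace messages depend on the release (gc_start/gc_end in older ones, gc_minor_start, gc_major_start and friends in newer ones), so the code just collects whatever tag arrives.

```erlang
-module(gc_trace_demo).
-export([run/0]).

%% Traces GC events of a worker process and returns them as a
%% list of {Tag, Timestamp} in arrival order.
run() ->
    Parent = self(),
    Worker = spawn(fun() ->
        %% Wait until tracing has been switched on.
        receive go -> ok end,
        L = lists:seq(1, 1000000),
        %% Force at least one collection while L is still live.
        erlang:garbage_collect(),
        Parent ! {done, length(L)}
    end),
    %% The calling process becomes the tracer for Worker.
    1 = erlang:trace(Worker, true, [garbage_collection, timestamp]),
    Worker ! go,
    receive {done, _} -> ok end,
    collect([]).

%% Drain the trace messages; stop after 100 ms of silence.
collect(Acc) ->
    receive
        {trace_ts, _Pid, Tag, _Info, Ts} ->
            collect([{Tag, Ts} | Acc])
    after 100 ->
        lists:reverse(Acc)
    end.
```

Pairing each start event with the following end event and passing the two timestamps to timer:now_diff/2 gives the time spent in that collection in microseconds.
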
>>> BR,
>>> Ulf
>>>
>>> ________________________________________________________________
>>> erlang-questions (at) erlang.org mailing list.
>>> See http://www.erlang.org/faq.html
>>> To unsubscribe; mailto:
>>>


