Exercise: ETS with secondary indices
Roberto Ostinelli
ostinelli@REDACTED
Tue Dec 10 20:26:20 CET 2019
Related, is there any difference in terms of performance in this case (
ordered_set) between:
ets:select(groups, [{
{{Name, '$2'}, '_', '_', '_'},
[],
['$2']
}])
and
ets:select(groups, [{
{{$1, '$2'}, '_', '_', '_'},
[{'=:=', '$1', Name}],
['$2']
}])
On Tue, Dec 10, 2019 at 12:34 PM Roberto Ostinelli <ostinelli@REDACTED>
wrote:
> Sverker,
> This looks very nice. I was considering using composite keys but I didn't
> know that partial lookups would work.
>
> Seems definitely cleaner, and if I understand correctly the lookup time
> will be the same as with a single key, with the added benefit of the delete
> functionalities (please correct me if I'm wrong).
>
> Thank you so much,
> r.
>
>
> On Mon, Dec 9, 2019 at 8:15 PM Sverker Eriksson <sverker@REDACTED>
> wrote:
>
>> Here is a variant of your second solution. But instead insert each
>> element-group pair as a *key*
>> and use ordered_set which has an optimization for *traversal with
>> partially bound key*.
>>
>>
>> ets:new(elements,[ordered_set,named_table]).
>> ets:new(groups,[ordered_set,named_table]).
>>
>> % Insert element
>> [ets:insert(elements, {{Element,G}}) || G <- Groups],
>> [ets:insert(groups, {{G,Element}}) || G <- Groups],
>>
>>
>> % Lookup groups from element
>> Groups = ets:select(elements, [{{{Element,'$1'}}, [], ['$1']}]),
>>
>> % Lookup elements from group
>> Elements = ets:select(groups, [{{{Group,'$1'}}, [], ['$1']}]),
>>
>> % Delete element
>> Groups = ets:select(elements, [{{{Element,'$1'}}, [], ['$1']}]),
>> ets:match_delete(elements, {{Element, '_'}}),
>> [ets:delete(groups, {G,Element}) || G <- Groups],
>>
>>
>> All select and match traversals use a partially bound key and will
>> only search the part of the ordered_set (tree) that may contain such keys.
>>
>>
>> /Sverker
>>
>> On lör, 2019-12-07 at 18:54 +0100, Roberto Ostinelli wrote:
>>
>> All,
>> I'm doing a conceptual exercise to evaluate the usage of ETS with
>> secondary indices.
>>
>> The exercise is really simple:
>>
>> 1. There are elements which can belong to groups, in a m-to-n
>> relationship (every element can belong to one or more groups).
>> 2. Elements and groups can be created/deleted at will.
>> 3. We need to find the groups an element belongs to.
>> 4. We need to find the elements included into a group.
>> 5. We need to not keep an element if it does not belong to a group,
>> nor a group if it does not have elements in it (this is to avoid memory
>> leaks since groups / elements get added / removed).
>>
>> All of these operations should be optimized as much as possible.
>>
>>
>> A. One possibility is to have 2 ETS tables of type set, that contain the
>> following terms:
>>
>> 1. table elements with tuple format {Element, Groups}
>> 2. table groups with tuple format {Group, Elements}
>>
>> Groups and Elements would be maps (so that finding an element of the map
>> does not require traversing an array).
>>
>> Adding an Element to a Group would mean:
>>
>> - An insert in the table elements where the Group gets added to the
>> Groups map (bonus of using a map: no duplicates can be created).
>> - An insert in the table groups where the Element gets added to the
>> Elements map.
>>
>> Retrieving the groups an element belongs to is simple as getting the
>> Element in the table elements, the groups will be keys of the Groups map.
>> Retrieving the elements a group contains to is simple as getting the
>> Group in the table groups, the elements will be keys of the Elements map.
>>
>> Deleting is far from being optimized though. For instance, deleting an
>> Element means:
>>
>> - Getting the Groups it belongs to with a lookup in the table elements
>> .
>> - For every Group in the Groups map, remove the Element from the
>> Elements map.
>>
>> It becomes something like this:
>>
>> case ets:lookup(elements, Element) of
>> [{Element, Groups}] ->
>> ets:delete(elements, Element),
>> lists:foreach(fun({Group}) ->
>> case ets:lookup(groups, Group) of
>> [{Group, Elements}] ->
>> case maps:remove(Element, Elements) of
>> Elements1 when map_size(Elements1) == 0 ->
>> ets:delete(groups, Group);
>> Elements1 ->
>> ets:insert(groups, {Group, Elements1})
>> end;
>> _ ->
>> ok
>> end
>> end, maps:keys(Groups))
>> _ ->
>> ok
>> end.
>>
>> Ouch. Same goes for groups. And this is to delete *one* element, imagine
>> if bigger operations need to be made.
>>
>>
>> B. Another possibility would be to have 2 ETS tables of type bag, that
>> contain the following terms:
>>
>> 1. table elements with tuple format {Element, Group}
>> 2. table groups with tuple format {Group, Element}
>>
>> Adding an Element to a Group means:
>>
>> - An insert in the table elements of the tuple {Element, Group}.
>> - An insert in the table groups of the tuple {Group, Element}.
>>
>> Retrieving the groups an element belongs to requires an
>> ets:match_object/2 or similar such as:
>> ets:match_object(elements, {Element, _ = '_'}).
>>
>> Retrieving the elements a group belongs to requires an ets:match_object/2
>> or similar such as:
>> ets:match_object(groups, {Group, _ = '_'}).
>>
>> So retrieving requires the traversing of a table, but it should be
>> relatively quick, given that the match is done on the index itself.
>>
>> Deleting is not particularly optimized though, because it requires the
>> traversing of a table with elements that are not indexed. For instance,
>> deleting an Element would look something like:
>> ets:match_delete(elements, {Element, _ = '_'}). %% fast, indexed
>> ets:match_delete(groups, {_ = '_', Element}). %% not fast, table is being
>> traversed.
>>
>>
>> What would *you* do? Any takers are welcome ^^_
>>
>> Cheers,
>> r.
>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20191210/fc33eec9/attachment.htm>
More information about the erlang-questions
mailing list