Exercise: ETS with secondary indices

Tue Dec 10 20:26:20 CET 2019

Related, is there any difference in terms of performance in this case (
ordered_set) between:

ets:select(groups, [{
    {{Name, '$2'}, '_', '_', '_'},
    [],
    ['$2']
}])

and

ets:select(groups, [{
    {{$1, '$2'}, '_', '_', '_'},
    [{'=:=', '$1', Name}],
    ['$2']
}])

On Tue, Dec 10, 2019 at 12:34 PM Roberto Ostinelli <ostinelli@REDACTED>
wrote:

> Sverker,
> This looks very nice. I was considering using composite keys but I didn't
> know that partial lookups would work.
>
> Seems definitely cleaner, and if I understand correctly the lookup time
> will be the same as with a single key, with the added benefit of the delete
> functionalities (please correct me if I'm wrong).
>
> Thank you so much,
> r.
>
>
> On Mon, Dec 9, 2019 at 8:15 PM Sverker Eriksson <sverker@REDACTED>
> wrote:
>
>> Here is a variant of your second solution. But instead insert each
>> element-group pair as a *key*
>> and use ordered_set which has an optimization for *traversal with
>> partially bound key*.
>>
>>
>> ets:new(elements,[ordered_set,named_table]).
>> ets:new(groups,[ordered_set,named_table]).
>>
>> % Insert element
>> [ets:insert(elements, {{Element,G}}) || G <- Groups],
>> [ets:insert(groups, {{G,Element}}) || G <- Groups],
>>
>>
>> % Lookup groups from element
>> Groups = ets:select(elements, [{{{Element,'$1'}}, [], ['$1']}]),
>>
>> % Lookup elements from group
>> Elements = ets:select(groups, [{{{Group,'$1'}}, [], ['$1']}]),
>>
>> % Delete element
>> Groups = ets:select(elements, [{{{Element,'$1'}}, [], ['$1']}]),
>> ets:match_delete(elements, {{Element, '_'}}),
>> [ets:delete(groups, {G,Element}) || G <- Groups],
>>
>>
>> All select and match traversals use a partially bound key and will
>> only search the part of the ordered_set (tree) that may contain such keys.
>>
>>
>> /Sverker
>>
>> On lör, 2019-12-07 at 18:54 +0100, Roberto Ostinelli wrote:
>>
>> All,
>> I'm doing a conceptual exercise to evaluate the usage of ETS with
>> secondary indices.
>>
>> The exercise is really simple:
>>
>>    1. There are elements which can belong to groups, in a m-to-n
>>    relationship (every element can belong to one or more groups).
>>    2. Elements and groups can be created/deleted at will.
>>    3. We need to find the groups an element belongs to.
>>    4. We need to find the elements included into a group.
>>    5. We need to not keep an element if it does not belong to a group,
>>    nor a group if it does not have elements in it (this is to avoid memory
>>    leaks since groups / elements get added / removed).
>>
>> All of these operations should be optimized as much as possible.
>>
>>
>> A. One possibility is to have 2 ETS tables of type set, that contain the
>> following terms:
>>
>> 1. table elements with tuple format {Element, Groups}
>> 2. table groups with tuple format {Group, Elements}
>>
>> Groups and Elements would be maps (so that finding an element of the map
>> does not require traversing an array).
>>
>> Adding an Element to a Group would mean:
>>
>>    - An insert in the table elements where the Group gets added to the
>>    Groups map (bonus of using a map: no duplicates can be created).
>>    - An insert in the table groups where the Element gets added to the
>>    Elements map.
>>
>> Retrieving the groups an element belongs to is simple as getting the
>> Element in the table elements, the groups will be keys of the Groups map.
>> Retrieving the elements a group contains to is simple as getting the
>> Group in the table groups, the elements will be keys of the Elements map.
>>
>> Deleting is far from being optimized though. For instance, deleting an
>> Element means:
>>
>>    - Getting the Groups it belongs to with a lookup in the table elements
>>    .
>>    - For every Group in the Groups map, remove the Element from the
>>    Elements map.
>>
>> It becomes something like this:
>>
>> case ets:lookup(elements, Element) of
>>     [{Element, Groups}] ->
>>         ets:delete(elements, Element),
>>         lists:foreach(fun({Group}) ->
>>             case ets:lookup(groups, Group) of
>>                 [{Group, Elements}] ->
>>                     case maps:remove(Element, Elements) of
>>                         Elements1 when map_size(Elements1) == 0 ->
>>                             ets:delete(groups, Group);
>>                         Elements1 ->
>>                             ets:insert(groups, {Group, Elements1})
>>                     end;
>>                 _ ->
>>                     ok
>>             end
>>         end, maps:keys(Groups))
>>     _ ->
>>         ok
>> end.
>>
>> Ouch. Same goes for groups. And this is to delete *one* element, imagine
>> if bigger operations need to be made.
>>
>>
>> B. Another possibility would be to have 2 ETS tables of type bag, that
>> contain the following terms:
>>
>> 1. table elements with tuple format {Element, Group}
>> 2. table groups with tuple format {Group, Element}
>>
>> Adding an Element to a Group means:
>>
>>    - An insert in the table elements of the tuple {Element, Group}.
>>    - An insert in the table groups of the tuple {Group, Element}.
>>
>> Retrieving the groups an element belongs to requires an
>> ets:match_object/2 or similar such as:
>> ets:match_object(elements, {Element, _ = '_'}).
>>
>> Retrieving the elements a group belongs to requires an ets:match_object/2
>> or similar such as:
>> ets:match_object(groups, {Group, _ = '_'}).
>>
>> So retrieving requires the traversing of a table, but it should be
>> relatively quick, given that the match is done on the index itself.
>>
>> Deleting is not particularly optimized though, because it requires the
>> traversing of a table with elements that are not indexed. For instance,
>> deleting an Element would look something like:
>> ets:match_delete(elements, {Element, _ = '_'}). %% fast, indexed
>> ets:match_delete(groups, {_ = '_', Element}). %% not fast, table is being
>> traversed.
>>
>>
>> What would *you* do? Any takers are welcome ^^_
>>
>> Cheers,
>> r.
>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20191210/fc33eec9/attachment.htm>