[erlang-questions] ets:match_object/3 strangeness

Sverker Eriksson sverker.eriksson@REDACTED
Tue Sep 15 15:18:32 CEST 2015

ETS 'bag' and 'duplicate_bag' tables allow key duplicates,
but are not at all optimized for a large number of duplicates.

Both hashing and iteration are based only on keys. All key duplicates will
thus end up in the same hash bucket, as a linked list. Every iteration
has to return all duplicates in one call, as there is no way to
yield/resume in the middle of a cluster of duplicates. That is why the
continuation term contains the rest of them.
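
A minimal sketch of the chunked traversal discussed here, assuming a fresh
duplicate_bag filled with 20 {bert,N,data} tuples. The module name
chunk_demo and the helper collect/2 are made up for illustration. It only
gathers the chunks and makes no claim about their size or order:

```erlang
-module(chunk_demo).
-export([run/0]).

%% Build a duplicate_bag where every object shares the key 'bert',
%% then walk it in chunks of (at most) 5 via match_object/3 and /1.
run() ->
    T = ets:new(sune, [duplicate_bag]),
    [ets:insert(T, {bert, N, data}) || N <- lists:seq(1, 20)],
    Chunks = collect(ets:match_object(T, {bert, '_', '_'}, 5), []),
    ets:delete(T),
    Chunks.

%% Follow the continuation until the table is exhausted,
%% returning the chunks in the order they were delivered.
collect('$end_of_table', Acc) ->
    lists:reverse(Acc);
collect({Objects, Continuation}, Acc) ->
    collect(ets:match_object(Continuation), [Objects | Acc]).
```

Calling chunk_demo:run() returns the list of chunks, so the sequence
numbers in each chunk can be inspected directly in the shell.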

The ordering strangeness seems more like a plain bug.

/Sverker, Erlang/OTP Ericsson

On 09/15/2015 02:33 AM, Robert Virding wrote:
> I have an ETS table called *sune* which is a duplicate_bag containing 
> many elements with the same key. The *bert* elements have the 
> structure *{bert,SequenceNo,Data}* so I can see the ordering. If I do 
> *ets:match_object(sune, {bert,'_','_'})* I will get back a list of all 
> the matching elements in the order they were inserted, which is what I 
> want. However, there can be quite a few elements with the same key so 
> I would prefer to get them in chunks instead of all at once. The 
> solution for this is to start with *ets:match_object(sune, 
> {bert,'_','_'}, 5)* which returns a continuation and then follow with 
> a sequence of *ets:match_object(Continuation)* until I get all the 
> elements in chunks of 5. That is when things start to get strange:
> - Now the ordering is completely messed up. Each chunk of 5 contains a 
> sequence but some in ascending order and some in descending order, for 
> example 440,439,438,437,436.
> - The chunks come in what looks like random order, for example chunks 
> starting with 571,425,566,430 but starting with the outermost first 
> working towards the middle.
> - The really weird thing is that the continuation actually contains 
> all the matching elements which haven't been returned yet.
> The last is the really strange one. According to the documentation for 
> match_object/3:
> "Works like ets:match_object/2 but only returns a limited (Limit) 
> number of matching objects. The Continuation term can then be used in 
> subsequent calls to ets:match_object/1 to get the next chunk of 
> matching objects. This is a space efficient way to work on objects in 
> a table which is still faster than traversing the table object by 
> object using ets:first/1 and ets:next/1."
> But the continuation contains them all so where is the space 
> efficiency? I mean why bother to use this at all as it gives me no 
> benefits.
> I tried the same thing using *ets:select/3* and *ets:select/1* and got 
> the same result.
> My original goal was to work out how this would interact with inserting or 
> deleting elements while I was scanning. I got my answer.
> Robert