[erlang-questions] fast tuple sorting

Dmitry Kolesnikov dmkolesnikov@REDACTED
Thu Jun 12 12:14:13 CEST 2014


Hello,

Thank you very much for your advises. 
Let’s re-start the discussion on this thread. I was unavailable recently but problem is still exists.

So, let me depict the task and my temporary solution for it but I’d like to obtain an opinion on other possible solutions.

There is a flow of request to the system. Each request is set of integer on interval [0, 2^32]. The set cardinality varies from 100 to 300 elements. Some of those integers has a special meaning, they are associated with list of tuples in format {integer(), float()}. The length of tuple lists over 200 elements and there are finite number of those key integers, about ~6M. The system filters “key integers” from incoming request. For each "key integer" it retrieves a list of tuples associated with it and returns this list to client. The list of tuples has to be sorted by first element. It is very typical scenario that resulting list is 30k - 50k tuples. As I said previously, the timing criteria is hard 6ms to sort the list of tuples.
Profiling says that sorting is bottleneck.      

What comes to implementation… 

The associated lists are stored within ets bag. Each ets:lookup operation return me a sub-list of tuples. What I need is to sort them. 

I’ve tries some of your advices:

* gb_trees / rbtrees the build time goes over 100ms
* ordered ets was fast enough to build but build + read + delete cycle does not fit to the budget
* binomial heap was fast to append but slow to read (normal heap suffered from building)   

I am afraid that problem is bond to data copy.  

The best possible option I came out is
 * sort tuple lists before they written to ets bag
 * use merge sort to build resulting list.

this options has nearly same performance as using ordered ets but it is not capable to sort 30k items in given time frame, it takes about 15ms. Fortunately, I was able to temporary reduce the list size to 6 - 8k result elements. The merge sort producing the output with 6 - 7ms. However, I am still seeking for full solution.

Best Regards, 
Dmitry


On 30 May 2014, at 22:14, Max Lapshin <max.lapshin@REDACTED> wrote:

> Dmitry, maybe you can tell a bit about your task?
> 
> What are these tuples? Where do you get them from?
> 




More information about the erlang-questions mailing list