Title :
A general and efficient algorithm for “top” queries
Author_Institution :
Hewlett-Packard Labs., Palo Alto, CA
Abstract :
"Top K" queries reduce a query result to the most interesting or the most urgent items. In many cases, e.g., when the result size is unbounded due to duplicate key values, a "top" operation cannot be implemented using the standard algorithm based on an in-memory priority queue. The default alternative is a full sort. External merge sort admits multiple novel optimizations specific to "top" operations. These are simple to implement yet greatly reduce the data volume written to runs on temporary storage. Experiments demonstrate substantial performance improvements, in one case exceeding three orders of magnitude.
Keywords :
data structures; merging; query processing; sorting; duplicate key values; in-memory priority queue; merge sort; top K query optimizations; Computer aided software engineering; Data mining; Gold; Laboratories; Medals; Memory management; Query processing; Robustness; Scalability; Sorting;
Conference_Titel :
Data Engineering Workshop, 2008. ICDEW 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-2161-9
Electronic_ISBN :
978-1-4244-2162-6
DOI :
10.1109/ICDEW.2008.4498379