Title :
Performance of KDB-trees with query-based splitting
Author :
Lépouchard, Yves ; Orlandic, Ratko ; Pfaltz, John L.
Author_Institution :
Dept. of Comput. Sci., Virginia Univ., Charlottesville, VA, USA
Abstract :
While the persistent data of many advanced database applications, such as OLAP and scientific studies, are characterized by very high dimensionality, typical queries posed on these data refer to only a small number of relevant dimensions. Unfortunately, the multidimensional access methods designed for high-dimensional data perform rather poorly for these partially specified queries. An appealing idea, frequently suggested in the literature, is to adopt a node-splitting policy that takes into account the "importance" of individual dimensions, which could be determined either a priori or through statistical sampling of actual queries. This paper presents the results of carefully controlled experiments conducted to observe the effects of query-based splitting on the performance of KDB-trees. The strategy is compared to a splitting policy that selects the split dimensions in a "cyclic" fashion, which has been shown to be very effective, especially in high-dimensional situations. Based on the results, query-based splitting does not appear to be a very appealing splitting strategy for KDB-trees.
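The two split-dimension policies contrasted in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact rules used in the experiments, so everything here is an assumption made for illustration: the function names, the representation of a sampled query as the set of dimensions it constrains, and in particular the scoring rule in `query_based_split_dim`, which is a hypothetical way of favoring "important" dimensions and is not taken from the paper.

```python
from collections import Counter


def cyclic_split_dim(level, num_dims):
    """Cyclic policy: rotate through the dimensions by tree level."""
    return level % num_dims


def dimension_importance(sampled_queries, num_dims):
    """Estimate per-dimension importance from sampled queries.

    Each query is modeled (an assumption) as the set of dimension
    indices it constrains. Importance is the share of all constraints
    that fall on a given dimension.
    """
    counts = Counter(d for q in sampled_queries for d in q)
    total = sum(counts.values())
    if total == 0:
        # No query information: treat all dimensions as equally important.
        return [1.0 / num_dims] * num_dims
    return [counts[d] / total for d in range(num_dims)]


def query_based_split_dim(importance, splits_so_far):
    """Hypothetical query-based policy.

    Picks the dimension with the highest importance, discounted by how
    often that dimension has already been split on this path. Ties go
    to the lowest dimension index.
    """
    scores = [w / (s + 1) for w, s in zip(importance, splits_so_far)]
    return max(range(len(scores)), key=lambda d: scores[d])


if __name__ == "__main__":
    queries = [{0, 2}, {0}, {0, 2}]  # partially specified queries over 4 dims
    weights = dimension_importance(queries, num_dims=4)
    print(cyclic_split_dim(level=5, num_dims=4))
    print(query_based_split_dim(weights, splits_so_far=[0, 0, 0, 0]))
```

Under this sketch, dimensions that no sampled query constrains are never chosen by the query-based rule while any constrained dimension remains, whereas the cyclic rule eventually splits on every dimension.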
Keywords :
query processing; software performance evaluation; tree data structures; very large databases; KDB-trees; OLAP; advanced database applications; experiments; high-dimensional data; multidimensional access methods; node-splitting policy; persistent data; query-based splitting; splitting strategy; statistical sampling; Application software; Computer science; Databases; Design methodology; Information retrieval; Multidimensional systems; Physics; Sampling methods; Tree data structures; US Department of Energy
Conference_Title :
Information Technology: Coding and Computing, 2002. Proceedings. International Conference on
Print_ISBN :
0-7695-1506-1
DOI :
10.1109/ITCC.2002.1000390