Title :
A comparative study of spatial indexing techniques for multidimensional scientific datasets
Author :
Nam, Beomseok ; Sussman, Alan
Author_Institution :
Dept. of Comput. Sci., Maryland Univ., College Park, MD, USA
Abstract :
Scientific applications that query very large multidimensional datasets are becoming more common. These datasets are growing in size every day, and are becoming truly enormous, making it infeasible to index individual data elements. We have instead been experimenting with chunking the datasets to index them, grouping data elements into small chunks of a fixed, but dataset-specific, size to take advantage of spatial locality. While spatial indexing structures based on R-trees perform reasonably well for the rectangular bounding boxes of such chunked datasets, other indexing structures based on KDB-trees, such as Hybrid trees, have been shown to perform very well for point data. In this paper, we investigate how all these indexing structures perform for multidimensional scientific datasets, and compare their features and performance with those of SH-trees, an extension of Hybrid trees, for indexing multidimensional rectangles. Our experimental results show that the algorithms for building and searching SH-trees outperform those for R-trees, R*-trees, and X-trees for both real application and synthetic datasets and queries. We show that the SH-tree algorithms perform well for both low- and high-dimensional data, and that they scale well to high dimensions both for building and searching the trees.
Keywords :
database indexing; database theory; query processing; scientific information systems; spatial data structures; tree data structures; tree searching; very large databases; visual databases; Hybrid trees; KDB-trees; R*-trees; R-trees; SH-trees building; SH-trees searching; X-trees; application datasets; chunked datasets; data element grouping; data element indexing; multidimensional dataset querying; multidimensional rectangle indexing; multidimensional scientific datasets; rectangular bounding boxes; scientific applications; spatial indexing structures; spatial locality; synthetic datasets; Application software; Buildings; Computer science; Educational institutions; Indexing; Laboratories; Multidimensional systems; NASA; Nearest neighbor searches; Subcontracting
Conference_Title :
Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on
Print_ISBN :
0-7695-2146-0
DOI :
10.1109/SSDM.2004.1311209