Title :
On the sequencing of tree structures for XML indexing
Author :
Wang, Haixun ; Meng, Xiaofeng
Author_Institution :
IBM Thomas J. Watson Res. Center, NY, USA
Abstract :
Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically through subsequence matching. In this paper, we address the problem of query equivalence with respect to this transformation, and we introduce a performance-oriented principle for sequencing tree structures. With query equivalence, XML queries can be performed through subsequence matching without join operations, post-processing, or other special handling for problems such as false alarms. We identify a class of sequencing methods for this purpose, and we present a novel subsequence matching algorithm that observe query equivalence. Still, query equivalence is just a prerequisite for sequence-based XML indexing. Our goal is to find the best sequencing strategy with regard to the time and space complexity in indexing and querying XML data. To this end, we introduce a performance-oriented principle to guide the sequencing of tree structures. For any given XML data set, the principle finds an optimal sequencing strategy according to its schema and its data distribution. We present a novel method that realizes this principle. In our experiments, we show the advantages of sequence-based indexing over traditional XML indexing methods, and we compare several sequencing strategies and demonstrate the benefit of the performance-oriented sequencing principle.
Keywords :
XML; database indexing; query languages; query processing; tree data structures; XML data querying; XML query processing; join operation; query equivalence; sequence-based XML indexing; space complexity; structured query; subsequence matching algorithm; time complexity; tree structure sequencing; Data mining; Database languages; Indexing; Intrusion detection; Neural networks; Project management; Query processing; Research and development management; Tree data structures; XML;
Conference_Titel :
Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on
Print_ISBN :
0-7695-2285-8
DOI :
10.1109/ICDE.2005.98