Regional Information Center for Science and Technology - Multiresolution indexing of XML for frequent queries

DocumentCode :

3268977

Title :

Multiresolution indexing of XML for frequent queries

Author :

He, Hao ; Yang, Jun

Author_Institution :

Dept. of Comput. Sci., Duke Univ., Durham, NC, USA

fYear :

2004

fDate :

30 March-2 April 2004

Firstpage :

683

Lastpage :

694

Abstract :

XML and other types of semistructured data are typically represented by a labeled directed graph. To speed up path expression queries over the graph, a variety of structural indexes have been proposed. They usually work by partitioning nodes in the data graph into equivalence classes and storing equivalence classes as index nodes. The A(k)-index introduces the concept of local bisimilarity for partitioning, allowing a trade-off between index size and query answering power. However, all index nodes in an A(k)-index have the same local similarity k, so it cannot take advantage of the fact that a workload may contain path expressions of different lengths, or that different parts of the data graph may have different local similarity requirements. To overcome these limitations, we propose M(k)- and M*(k)-indexes. The basic M(k)-index is workload-aware: like the previously proposed D(k)-index, it allows different index nodes to have different local similarity requirements, providing finer partitioning only for parts of the data graph targeted by longer path expressions. Unlike the D(k)-index, the M(k)-index is never over-refined for irrelevant index or data nodes. However, the workload-aware feature still incurs over-refinement due to over-qualified parent index nodes. Moreover, fine partitions penalize the performance of short path expressions. To solve these problems, we further propose the M*(k)-index. An M*(k)-index consists of a collection of indexes whose nodes are organized in a partition hierarchy, allowing successively coarser partitioning information to co-exist with the finest partitioning information required. Experiments show that our indexes are superior to previously proposed indexes in terms of index size and query performance.
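
The partitioning idea behind the A(k)-index can be illustrated with a minimal sketch. This is not the authors' implementation and it does not cover the M(k)- or M*(k)-index; it only shows uniform local bisimilarity, assuming the usual definition in which two nodes are 0-bisimilar when they share a label and k-bisimilar when they are also (k-1)-bisimilar and their parents fall into the same (k-1)-classes. The function name `k_bisim_partition` and the input format (a label dictionary plus a list of parent-child edges) are hypothetical choices made for this example.

```python
from collections import defaultdict


def k_bisim_partition(labels, edges, k):
    """Group nodes of a labeled directed graph into k-bisimilarity classes.

    labels: dict mapping node -> label (hypothetical input format)
    edges:  iterable of (parent, child) pairs
    k:      local similarity, i.e. how many levels of ancestors are compared
    """
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)

    # Round 0: nodes are grouped by label alone.
    block = dict(labels)

    for _ in range(k):
        # A node's signature is its previous class plus the set of
        # previous classes of its parents.
        signature = {
            node: (block[node], frozenset(block[p] for p in parents[node]))
            for node in labels
        }
        # Renumber signatures with small integers to keep them compact.
        ids = {}
        block = {node: ids.setdefault(sig, len(ids))
                 for node, sig in signature.items()}

    groups = defaultdict(set)
    for node, b in block.items():
        groups[b].add(node)
    return sorted(sorted(g) for g in groups.values())


# Toy data graph: root -> a -> c -> d and root -> b -> c -> d.
labels = {1: "root", 2: "a", 3: "b", 4: "c", 5: "c", 6: "d", 7: "d"}
edges = [(1, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7)]

for k in range(3):
    print(k, k_bisim_partition(labels, edges, k))
```

In this toy graph the two `d` nodes stay in one class for k = 0 and k = 1 and are separated only at k = 2, which is the size versus answering-power trade-off the abstract describes: a larger k yields more index nodes but lets the index answer longer path expressions exactly.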

Keywords :

XML; bisimulation equivalence; data structures; database indexing; directed graphs; equivalence classes; query processing; A(k)-index; M(k)-index; data graph; expression queries; frequent queries; labeled directed graph; local bisimilarity; multiresolution indexing; over-qualified parent index node; partitioning nodes; path expression; query answering power; semistructured data; short path expression; structural index; workload-aware feature; Computer science; Data engineering; Data models; Database languages; Engineering profession; Helium; Indexing; Internet

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Data Engineering, 2004. Proceedings. 20th International Conference on

ISSN :

1063-6382

Print_ISBN :

0-7695-2065-0

Type :

conf

DOI :

10.1109/ICDE.2004.1320037

Filename :

1320037

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3268977