DocumentCode
2830896
Title
Path Tree: Document Synopsis for XPath Query Selectivity Estimation
Author
Alrammal, Muath ; Hains, Gaétan ; Zergaoui, Mohamed
Author_Institution
Innovimax SARL, Paris, France
fYear
2011
fDate
June 30 2011-July 2 2011
Firstpage
321
Lastpage
328
Abstract
XML is one of the most important standards for manipulating data on the Internet. However, querying large volumes of XML data is a bottleneck for several computationally intensive applications. One solution is to pre-process the document in streaming mode, using resources approximately proportional to document depth and query selectivity. Limited processing space can then accommodate much larger documents, but the actual savings vary so much that they are unpredictable. To overcome this limitation of stream processing, we propose a new application of the path tree synopsis data structure. Such a synopsis provides a succinct description of the original document, with low computational overhead and high accuracy for processing tasks such as selectivity estimation and query answer approximation. In this paper, we formally define the path tree synopsis, which was introduced and used only informally in earlier work, and propose a new streaming algorithm to construct it. We also present an online stream-querying system able to estimate the cost of a given query before answering it accurately. The core algorithm is adapted from LQ; we apply it to path tree traversal, cost estimation, query processing, and even optimizations.
Keywords
Internet; XML; computational complexity; query processing; tree data structures; XPath query selectivity estimation; computational overhead; data manipulation; document preprocessing; online stream querying system; path tree synopsis data structure; query answer approximation; query optimization; stream processing; streaming algorithm; Accuracy; Data structures; Doped fiber amplifiers; Estimation; Q measurement; Query processing
fLanguage
English
Publisher
IEEE
Conference_Titel
Complex, Intelligent and Software Intensive Systems (CISIS), 2011 International Conference on
Conference_Location
Seoul
Print_ISBN
978-1-61284-709-2
Electronic_ISBN
978-0-7695-4373-4
Type
conf
DOI
10.1109/CISIS.2011.53
Filename
5989033