DocumentCode :
2830896
Title :
Path Tree: Document Synopsis for XPath Query Selectivity Estimation
Author :
Alrammal, Muath ; Hains, Gaétan ; Zergaoui, Mohamed
Author_Institution :
Innovimax SARL, Paris, France
fYear :
2011
fDate :
June 30, 2011 - July 2, 2011
Firstpage :
321
Lastpage :
328
Abstract :
XML is one of the most important standards for manipulating data on the Internet. However, querying large volumes of XML data represents a bottleneck for several computationally intensive applications. A solution is to pre-process the document in streaming mode with resources approximately proportional to document depth and query selectivity. Limited processing space can then accommodate much larger documents. But the actual savings vary so much as to make them unpredictable. To overcome this limitation of stream processing, we propose a new application of the path tree synopsis data structure. Such a synopsis provides a succinct description of the original document with low computational overhead and high accuracy for processing tasks like selectivity estimation and query answer approximation. In this paper, we formally define the path tree synopsis, informally introduced by and used by, and propose a new streaming algorithm to construct it. We also present an online stream-querying system able to estimate the cost for a given query before answering it accurately. The core algorithm is adapted from LQ; we apply it to path tree traversal, cost estimation, query processing, and even optimizations.
Keywords :
Internet; XML; computational complexity; query processing; tree data structures; XPath query selectivity estimation; computational overhead; data manipulation; document preprocessing; online stream querying system; path tree synopsis data structure; query answer approximation; query optimization; stream processing; streaming algorithm; Accuracy; Data structures; Doped fiber amplifiers; Estimation; Q measurement
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Complex, Intelligent and Software Intensive Systems (CISIS), 2011 International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-61284-709-2
Electronic_ISBN :
978-0-7695-4373-4
Type :
conf
DOI :
10.1109/CISIS.2011.53
Filename :
5989033