• DocumentCode
    2830896
  • Title
    Path Tree: Document Synopsis for XPath Query Selectivity Estimation
  • Author
    Alrammal, Muath ; Hains, Gaétan ; Zergaoui, Mohamed
  • Author_Institution
    Innovimax SARL, Paris, France
  • fYear
    2011
  • fDate
    June 30 2011-July 2 2011
  • Firstpage
    321
  • Lastpage
    328
  • Abstract
    XML is one of the most important standards for manipulating data on the Internet. However, querying large volumes of XML data represents a bottleneck for several computationally intensive applications. A solution is to pre-process the document in streaming mode with resources approximately proportional to document depth and query selectivity. Limited processing space can then accommodate much larger documents. But the actual savings vary so much as to make them unpredictable. To overcome this limitation of stream processing, we propose a new application of the path tree synopsis data structure. Such a synopsis provides a succinct description of the original document with low computational overhead and high accuracy for processing tasks like selectivity estimation and query answer approximation. In this paper, we formally define the path tree synopsis, which earlier work introduced and used only informally, and propose a new streaming algorithm to construct it. We also present an online stream-querying system able to estimate the cost of a given query before answering it accurately. The core algorithm is adapted from LQ; we apply it to path tree traversal, cost estimation, query processing and even optimizations.
  • Keywords
    Internet; XML; computational complexity; query processing; tree data structures; XPath query selectivity estimation; computational overhead; data manipulation; document preprocessing; online stream querying system; path tree synopsis data structure; query answer approximation; query optimization; stream processing; streaming algorithm; Accuracy; Data structures; Doped fiber amplifiers; Estimation; Q measurement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Complex, Intelligent and Software Intensive Systems (CISIS), 2011 International Conference on
  • Conference_Location
    Seoul
  • Print_ISBN
    978-1-61284-709-2
  • Electronic_ISBN
    978-0-7695-4373-4
  • Type
    conf
  • DOI
    10.1109/CISIS.2011.53
  • Filename
    5989033