• DocumentCode
    3322951
  • Title
    The Space Complexity of Processing XML Twig Queries Over Indexed Documents
  • Author
    Shalem, Mirit ; Bar-Yossef, Ziv
  • Author_Institution
    Dept. of Comput. Sci., Technion-Israel Inst. of Technol., Haifa
  • fYear
    2008
  • fDate
    7-12 April 2008
  • Firstpage
    824
  • Lastpage
    832
  • Abstract
    Current twig join algorithms incur high memory costs on queries that involve child-axis nodes. In this paper we provide an analytical explanation for this phenomenon. In a first large-scale study of the space complexity of evaluating XPath queries over indexed XML documents we show the space to depend on three factors: (1) whether the query is a path or a tree; (2) the types of axes occurring in the query and their occurrence pattern; and (3) the mode of query evaluation (filtering, full-fledged, or "pattern matching"). Our lower bounds imply that evaluation of a large class of queries that have child-axis nodes indeed requires large space. Our study also reveals that on some queries there is a large gap between the space needed for pattern matching and the space needed for full-fledged evaluation or filtering. This implies that many existing twig join algorithms, which work in the pattern matching mode, incur significant space overhead. We present a new twig join algorithm that avoids this overhead. On certain queries our algorithm is exceedingly more space-efficient than existing algorithms, sometimes bringing the space down from linear in the document size to constant.
  • Keywords
    XML; computational complexity; indexing; query processing; XML twig queries processing; XPath queries; child-axis nodes; indexed documents; query evaluation; space complexity; Computer science; Costs; Encoding; Filtering; Large-scale systems; Matched filters; Pattern matching; Query processing; Relational databases; XML;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
  • Conference_Location
    Cancun
  • Print_ISBN
    978-1-4244-1836-7
  • Electronic_ISBN
    978-1-4244-1837-4
  • Type
    conf
  • DOI
    10.1109/ICDE.2008.4497491
  • Filename
    4497491