• DocumentCode
    3363456
  • Title
    PathGuide: an efficient clustering based indexing method for XML path expressions
  • Author
    Cheng, Jiefeng ; Ge Yu ; Wang, Guoren ; Yu, Ge
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Northeastern Univ., Shenyang, China
  • fYear
    2003
  • fDate
    26-28 March 2003
  • Firstpage
    257
  • Lastpage
    264
  • Abstract
    This paper focuses on improving the performance of long-path XML query processing. It is motivated by the fact that existing inverted index and join algorithms are efficient for short-path XML queries but inefficient for long-path XML queries, since their response time is exponential in the path length. We propose a clustering-based indexing method, called PathGuide, which enhances the XML inverted index with a clustering technique. Element nodes are clustered by their path patterns, and a summary of this path information is kept in a suffix tree that serves as the index of these element nodes. In addition, new operations are proposed to fully utilize PathGuide. With the assistance of PathGuide, and unlike the path expansion approach used in Lore, the set of nodes matching a relative location path can be found via a one-step index lookup. Compared to the existing structural join method, PathGuide significantly reduces both join overhead and disk I/O cost. Extensive experimental studies were conducted, and the results show that PathGuide outperforms structural joins by at least a factor of four in most cases.
  • Keywords
    hypermedia markup languages; indexing; pattern clustering; query processing; Lore; PathGuide; XML inverted index; XML path expressions; clustering technique; efficient clustering based indexing method; inverted index; join overhead; long-path XML query processing; one-step index lookup; path expansion approach; relative location path; short path XML queries; structural join method; Clustering algorithms; Costs; Database languages; Delay; Indexing; Navigation; Proposals; Query processing; Tree data structures; XML;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database Systems for Advanced Applications, 2003. (DASFAA 2003). Proceedings. Eighth International Conference on
  • Conference_Location
    Kyoto, Japan
  • Print_ISBN
    0-7695-1895-8
  • Type
    conf
  • DOI
    10.1109/DASFAA.2003.1192390
  • Filename
    1192390
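
The idea summarized in the abstract (cluster element nodes by path pattern, keep the path summary in a suffix structure, answer a relative location path with a single lookup) can be illustrated with a minimal sketch. This is not the paper's actual data structure or algorithm: the class `PathSuffixIndex`, its methods `add` and `lookup`, and the sample labels are all hypothetical, and a simple trie over reversed label paths stands in for the suffix tree the abstract mentions.

```python
from collections import defaultdict


class _Node:
    """Trie node reached by reading a path suffix from its last label backwards."""

    def __init__(self):
        self.children = {}  # label -> _Node
        self.paths = []     # full label paths (cluster keys) ending with this suffix


class PathSuffixIndex:
    """Hypothetical toy index, assumed for illustration only.

    Element ids are clustered by their root-to-node label path, and every
    distinct path is registered in a trie keyed by its reversed labels.
    """

    def __init__(self):
        self.clusters = defaultdict(list)  # full label path -> element ids
        self.root = _Node()

    def add(self, element_id, label_path):
        key = tuple(label_path)
        if key not in self.clusters:
            # First element with this path pattern: register the pattern
            # under every one of its suffixes.
            node = self.root
            for label in reversed(key):
                node = node.children.setdefault(label, _Node())
                node.paths.append(key)
        self.clusters[key].append(element_id)

    def lookup(self, relative_path):
        """Return ids of elements whose label path ends with relative_path."""
        node = self.root
        for label in reversed(relative_path):
            node = node.children.get(label)
            if node is None:
                return []
        result = []
        for key in node.paths:
            result.extend(self.clusters[key])
        return sorted(result)


# Made-up sample data: element ids with their root-to-node label paths.
idx = PathSuffixIndex()
idx.add(1, ["site", "people", "person", "name"])
idx.add(2, ["site", "people", "person", "name"])
idx.add(3, ["site", "regions", "item", "name"])
idx.add(4, ["site", "people", "person"])

print(idx.lookup(["person", "name"]))
print(idx.lookup(["name"]))
```

In this sketch a query such as `person/name` is answered by one walk down the trie followed by reading the matching clusters, with no per-step join between `person` and `name` element lists; that is the contrast with structural joins that the abstract draws, under the simplifying assumptions stated above.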