• DocumentCode
    3363429
  • Title
    A fast and versatile path index for querying semi-structured data
  • Author
    Barg, Michael ; Wong, Raymond K.
  • Author_Institution
    Sch. of Comput. Sci. & Eng., New South Wales Univ., Sydney, NSW, Australia
  • fYear
    2003
  • fDate
    26-28 March 2003
  • Firstpage
    249
  • Lastpage
    256
  • Abstract
    The richness of semi-structured data allows data of varied and inconsistent structures to be stored in a single database. Such data can be represented as a graph, and queries can be constructed using path expressions, which describe traversals through the graph. Instead of providing optimal performance for a limited range of path expressions, we propose a mechanism which is shown to have consistent and high performance for path expressions of any complexity, including those with descendant operators (path wildcards). We further detail mechanisms which employ our index to perform more complex processing, such as evaluating both path expressions containing links and entire (sub)queries containing path-based predicates. Performance is shown to be independent of the number of terms in the path expression, even where these contain wildcards. Experiments show that our index is faster than conventional methods by up to two orders of magnitude for certain query types, is small, and scales well.
  • Keywords
    graph theory; hypermedia markup languages; query processing; tree data structures; descendant operators; path based predicates; path expressions; path index; query types; querying; semi-structured data queries; wildcards; Computer science; Data engineering; Databases; Degradation; Encoding; Indexes; Indexing; Motion pictures; Performance evaluation; Query processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database Systems for Advanced Applications, 2003. (DASFAA 2003). Proceedings. Eighth International Conference on
  • Conference_Location
    Kyoto, Japan
  • Print_ISBN
    0-7695-1895-8
  • Type
    conf
  • DOI
    10.1109/DASFAA.2003.1192389
  • Filename
    1192389