• DocumentCode
    3012836
  • Title
    Querying XML documents made easy: nearest concept queries
  • Author
    Schmidt, Albrecht ; Kersten, Martin ; Windhouwer, Menzo
  • Author_Institution
    CWI, Amsterdam, Netherlands
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    321
  • Lastpage
    329
  • Abstract
    Due to the ubiquity and popularity of XML, users often find themselves in the following situation: they want to query XML documents that contain potentially interesting information, but they are unaware of the mark-up structure that is used. For example, it is easy to guess the contents of an XML bibliography file, whereas the mark-up depends on the methodological, cultural and personal background of the author(s). Nonetheless, it is this hierarchical structure that forms the basis of XML query languages. We exploit the tree structure of XML documents to equip users with a powerful tool, the meet operator, which lets them query databases whose content they are familiar with, without requiring knowledge of tags and hierarchies. Our approach is based on computing the lowest common ancestor of nodes in the XML syntax tree: e.g., given two strings, we look for nodes whose offspring contain these two strings. The novelty of this approach is that the result type is unknown at query formulation time and depends on the database instance. If the two strings are an author's name and a year, mainly publications of the author in this year are returned. If the two strings are numbers, the result mostly consists of publications that have the numbers as year or page numbers. Because the result type of a query is not specified by the user, we refer to the lowest common ancestor as the nearest concept. We also present a running example taken from the bibliography domain, and demonstrate that the operator can be implemented efficiently.
  • Keywords
    data models; hypermedia markup languages; multimedia databases; query languages; query processing; tree data structures; XML bibliography file; XML document querying; mark-up structure; nearest concept queries; query formulation time; syntax tree; tree structure; Art; Cultural differences; Data models; Database languages; Encoding; HTML; Object oriented databases; Tree data structures; Web sites; XML
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2001. Proceedings. 17th International Conference on
  • Conference_Location
    Heidelberg
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-1001-9
  • Type
    conf
  • DOI
    10.1109/ICDE.2001.914844
  • Filename
    914844
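
The lowest-common-ancestor idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation of the meet operator: the function name nearest_concepts, the SAMPLE document, the substring match on element text, and the final step that keeps only the lowest ancestors are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography; the tag names are illustrative only.
SAMPLE = """<bib>
  <article><author>Smith</author><year>1999</year><title>Trees</title></article>
  <article><author>Smith</author><year>2000</year><title>Graphs</title></article>
</bib>"""


def nearest_concepts(xml_text, term_a, term_b):
    """Return the lowest elements whose subtree contains both terms."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: node for node in root.iter() for child in node}

    def path_to_root(node):
        path = [node]
        while path[-1] in parent:
            path.append(parent[path[-1]])
        return path

    def matches(term):
        # Assumption: a term matches an element if it occurs in its text.
        return [n for n in root.iter() if term in (n.text or "")]

    # Lowest common ancestor of every (match_a, match_b) pair.
    lcas = []
    for a in matches(term_a):
        path_a = path_to_root(a)
        for b in matches(term_b):
            on_path_b = set(path_to_root(b))
            lca = next(n for n in path_a if n in on_path_b)
            if lca not in lcas:
                lcas.append(lca)

    # Simplification: drop any candidate that is a proper ancestor of
    # another candidate, so only the nearest enclosing elements remain.
    def is_proper_ancestor(x, y):
        return x is not y and x in path_to_root(y)

    return [x for x in lcas if not any(is_proper_ancestor(x, y) for y in lcas)]


hits = nearest_concepts(SAMPLE, "Smith", "1999")
print([(h.tag, h.findtext("title")) for h in hits])
```

The caller supplies only two strings and never names a tag. The element type that comes back (here an article) is determined by the document instance, which is the property the abstract calls the nearest concept.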