• DocumentCode
    2548583
  • Title
    Categorizing Queries by Topic Directory
  • Author
    He, Miao ; Cutler, Michal ; Wu, Kelvin
  • Author_Institution
    Comput. Sci. Dept., Binghamton Univ., Binghamton, NY
  • fYear
    2008
  • fDate
    20-22 July 2008
  • Firstpage
    278
  • Lastpage
    284
  • Abstract
    The categorization of a Web user query by topic or category can be used to select useful Web sources that contain the required information. In pursuit of this goal, we explore methods for mapping user queries to category hierarchies under which deep Web resources are also assumed to be classified. Our sources for these category hierarchies, or directories, are Yahoo! Directory and Wikipedia. Forwarding an unrefined query (in our case, a typical fact-finding query sent to a question-answering system) directly to these directory resources usually returns either no directories or incorrect ones. Instead, we develop techniques to generate more specific directory-finding queries from an unrefined query and use these to retrieve better directories. Even with these engineered queries, our two resources often return multiple directories that include many incorrect results, i.e., directories whose categories are not related to the query, so that Web resources for these categories are unlikely to contain the required information. We therefore develop methods for selecting the most useful directories. We consider a directory to be useful if Web sources for any of its narrow categories are likely to contain the searched-for information. We evaluate our mapping system on a set of 250 TREC questions and obtain precision and recall in the 0.8 to 1.0 range.
  • Keywords
    Internet; Web sites; query processing; search engines; Web resources; Web user query; Wikipedia; Yahoo! Directory; query categorization; Computer science; Engines; Helium; Information management; Kelvin; Manufacturing; Metasearch; Optical fiber devices; Textile fibers; Web directory; query answering system
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web-Age Information Management, 2008. WAIM '08. The Ninth International Conference on
  • Conference_Location
    Zhangjiajie, Hunan
  • Print_ISBN
    978-0-7695-3185-4
  • Electronic_ISBN
    978-0-7695-3185-4
  • Type
    conf
  • DOI
    10.1109/WAIM.2008.82
  • Filename
    4597025