  • DocumentCode
    2195147
  • Title
    Enhancing Document Exploration with OLAP
  • Author
    Chen, Zhibo ; Garcia-Alvarado, Carlos ; Ordonez, Carlos
  • Author_Institution
    Univ. of Houston, Houston, TX, USA
  • fYear
    2010
  • fDate
    13 Dec. 2010
  • Firstpage
    1407
  • Lastpage
    1410
  • Abstract
    Finding relevant documents in digital libraries is a well-studied problem in information retrieval. Users often browse digital collections without a clear idea of the keyword search they should perform. However, we believe that such an initial query is not totally independent of the target search. Therefore, we use these initial document selections to explore the documents further. In this demonstration, we exploit On-Line Analytical Processing (OLAP) for knowledge discovery in digital collections to achieve query refinement. The refinement results from applying a traditional ranking technique based on the vector space model, selecting the top keywords in the resulting subset of documents, and then displaying certain cuboids of those keywords. Based on these cuboids, which are ranked by frequency, users can select a query that better represents their actual target search. We show that this document exploration can be done efficiently within the DBMS and can exploit in-database extensions, such as User-Defined Functions, as well as standard SQL. Additionally, we demonstrate a novel approach to obtaining query refinement through OLAP data cubes.
  • Keywords
    SQL; data mining; digital libraries; document handling; information retrieval; query processing; search problems; DBMS; OLAP data cube; digital library; document selection; enhancing document exploration; in-database extension; keyword search; knowledge discovery; online analytical processing; query refinement; query search; ranking technique; standard SQL; user browsing; user defined function; vector space model; OLAP; UDF
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-1-4244-9244-2
  • Electronic_ISBN
    978-0-7695-4257-7
  • Type
    conf
  • DOI
    10.1109/ICDMW.2010.37
  • Filename
    5693464
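
The refinement pipeline outlined in the abstract (vector space ranking, top-keyword selection, then frequency-ranked keyword cuboids) can be illustrated with a minimal sketch. The paper reports running this inside the DBMS with SQL and User-Defined Functions; the Python below is only an illustration under stated assumptions. The function names `rank_documents` and `keyword_cuboids`, the smoothed TF-IDF weighting, and the choice to count a cuboid cell by the number of documents containing every keyword in the combination are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Assumption: whitespace tokenization on lowercased text.
    return text.lower().split()


def rank_documents(docs, query):
    """Return indices of matching documents, best first, using
    cosine similarity over TF-IDF vectors (vector space model)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency of each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumed weighting).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(tokens):
        tf = Counter(tokens)
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vectorize(tokenize(query))
    scores = [(cosine(vectorize(toks), q), i) for i, toks in enumerate(tokenized)]
    # Keep only documents with a nonzero score; ties break by index.
    return [i for s, i in sorted(scores, key=lambda p: (-p[0], p[1])) if s > 0]


def keyword_cuboids(docs, doc_ids, top_keywords=3, max_dims=2):
    """Pick the most frequent keywords in the selected documents, then
    count how many documents contain each keyword combination."""
    subset = [set(tokenize(docs[i])) for i in doc_ids]
    freq = Counter(t for toks in subset for t in toks)
    ordered = sorted(freq.items(), key=lambda p: (-p[1], p[0]))
    keywords = [t for t, _ in ordered[:top_keywords]]

    cuboids = Counter()
    for toks in subset:
        present = sorted(k for k in keywords if k in toks)
        for dims in range(1, max_dims + 1):
            for combo in combinations(present, dims):
                cuboids[combo] += 1
    # Most frequent combinations first, as candidate refined queries.
    return sorted(cuboids.items(), key=lambda p: (-p[1], p[0]))


docs = [
    "olap cube query",
    "olap cube index",
    "music theory",
    "olap query refinement",
]
ranked = rank_documents(docs, "olap")
cuboids = keyword_cuboids(docs, ranked, top_keywords=3, max_dims=2)
```

Here each tuple in `cuboids` is a candidate refined query paired with the number of retrieved documents that contain all of its keywords; a user would pick one of the higher-ranked combinations as the next search.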