• DocumentCode
    1791747
  • Title
    Topological models of document-query sets in retrieval for Enterprise Information Management
  • Author
    Deolalikar, Vinay
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    18
  • Lastpage
    23
  • Abstract
    The tasks, challenges, and techniques of Information Retrieval (IR) should reflect the structure of the underlying document-query sets and the needs of the domain. Are document-query sets obtained from the enterprise domain fundamentally different from standard research corpora gathered from the web? In order to identify, understand, and characterize such structural differences, we build a framework using point set topology to analyze document-query sets. Our framework tailors topological notions such as subbasis, cover, and compactness towards IR. Unlike previous topological approaches, we use the reverse of the relevance map to topologize the set of queries, not the set of documents. We show that the topological approach exposes sharp differences between enterprise and web-collected standard research document-query sets. These differences readily motivate research into new retrieval tasks that are of commercial importance in Enterprise Information Management (EIM).
  • Keywords
    Internet; document handling; information management; query processing; topology; EIM; IR; Web-collected standard research document-query sets; enterprise information management; information retrieval; relevance map; topological models; topological notions; Benchmark testing; Indexes; Information management; Information retrieval; Law; Standards; Topology
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Big Data (Big Data), 2014 IEEE International Conference on
  • Conference_Location
    Washington, DC
  • Type
    conf
  • DOI
    10.1109/BigData.2014.7004426
  • Filename
    7004426