• DocumentCode
    3300336
  • Title

    Toward a robust data fusion for document retrieval

  • Author

    He, Daqing ; Wu, Dan

  • Author_Institution
    Sch. of Inf. Sci., Univ. of Pittsburgh, Pittsburgh, PA
  • fYear
    2008
  • fDate
    19-22 Oct. 2008
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    This paper describes an investigation of signal boosting techniques for post-search data fusion, where the quality of the retrieval results involved in fusion may be low or diverse. The effectiveness of data fusion techniques in such situations depends on their ability to boost the signals from relevant documents and to reduce the effect of noise, which often comes from low-quality retrieval results. Our studies on the Malach spoken document collection and the HARD collection have demonstrated that CombMNZ, the most widely used data fusion method, does not have this ability. We therefore developed two signal boosting mechanisms on top of CombMNZ, which result in two new fusion methods called WCombMNZ and WCombMWW. To examine the effectiveness of the two new methods, we conducted experiments on the Malach and HARD document collections. Our results show that the new methods can significantly outperform CombMNZ when combining retrieval results of low and diverse quality. When the task is to combine retrieval results of similar quality, the scenario in which CombMNZ is most often applied, the two new methods still often obtain better, and sometimes significantly better, fusion results.
  • Keywords
    information retrieval; sensor fusion; HARD collection; Malach spoken document collection; WCombMNZ; WCombMWW; document retrieval; post-search data fusion; signal boosting techniques; Boosting; Diversity reception; Fusion power generation; Helium; Information management; Information resources; Information retrieval; Noise reduction; Robustness; Thesauri; CombMNZ; Data fusion; Malach; Spoken document retrieval; TREC HARD;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-4515-8
  • Electronic_ISBN
    978-1-4244-2780-2
  • Type

    conf

  • DOI
    10.1109/NLPKE.2008.4906754
  • Filename
    4906754
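
The abstract names CombMNZ as the baseline but does not define it. As a rough illustration only, the sketch below follows the commonly cited formulation: sum a document's normalized scores across runs, then multiply by the number of runs that retrieved it. The min-max normalization, the function names, and counting a document as a "hit" whenever it appears in a run are assumptions made for this sketch, not details taken from the paper. WCombMNZ and WCombMWW are not specified in this record, so they are not sketched here.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale one run's scores to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # All scores identical: treat every document as equally strong.
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_mnz(runs):
    """Fuse runs (each a dict of doc id -> raw score) with CombMNZ.

    Fused score = (sum of normalized scores) * (number of runs that
    retrieved the document). Returns (doc, score) pairs, best first.
    """
    total = defaultdict(float)
    hits = defaultdict(int)
    for run in runs:
        if not run:
            continue  # an empty run contributes nothing
        for doc, score in min_max_normalize(run).items():
            total[doc] += score
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    # Sort by descending score, breaking ties by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical example: two runs that agree only on document "d1".
example_runs = [
    {"d1": 3.0, "d2": 1.0},
    {"d1": 2.0, "d3": 1.0},
]
ranking = comb_mnz(example_runs)
```

In this toy input, "d1" is ranked first because both runs retrieved it, which is the agreement effect that the multiplication by the hit count is meant to reward.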