DocumentCode
3300336
Title
Toward a robust data fusion for document retrieval
Author
He, Daqing ; Wu, Dan
Author_Institution
Sch. of Inf. Sci., Univ. of Pittsburgh, Pittsburgh, PA
fYear
2008
fDate
19-22 Oct. 2008
Firstpage
1
Lastpage
8
Abstract
This paper describes an investigation of signal boosting techniques for post-search data fusion, where the quality of the retrieval results involved in fusion may be low or diverse. The effectiveness of data fusion techniques in such situations depends on their ability to boost the signals from relevant documents and reduce the effect of noise, which often comes from low-quality retrieval results. Our studies on the Malach spoken document collection and the HARD collection have demonstrated that CombMNZ, the most widely used data fusion method, does not have this ability. We therefore developed two signal boosting mechanisms on top of CombMNZ, which result in two new fusion methods called WCombMNZ and WCombMWW. To examine the effectiveness of the two new methods, we conducted experiments on the Malach and HARD document collections. Our results show that the new methods can significantly outperform CombMNZ when combining retrieval results whose quality is low and diverse. When the task is to combine retrieval results of similar quality, which is the scenario in which CombMNZ is most often applied, the two new methods still often obtain better fusion results, sometimes significantly so.
Keywords
information retrieval; sensor fusion; HARD collection; Malach spoken document collection; WCombMNZ; WCombMWW; document retrieval; post-search data fusion; signal boosting techniques; Boosting; Diversity reception; Fusion power generation; Helium; Information management; Information resources; Information retrieval; Noise reduction; Robustness; Thesauri; CombMNZ; Data fusion; Malach; Spoken document retrieval; TREC HARD;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-4515-8
Electronic_ISBN
978-1-4244-2780-2
Type
conf
DOI
10.1109/NLPKE.2008.4906754
Filename
4906754