Title :
Promoting Ranking Diversity for Biomedical Information Retrieval Based on LDA
Author :
Chen, Yan ; Yin, Xiaoshi ; Li, Zhoujun ; Hu, Xiaohua ; Huang, Jimmy Xiangji
Author_Institution :
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
Abstract :
In this paper, we propose an approach based on a generative topic model, Latent Dirichlet Allocation (LDA), for promoting ranking diversity in biomedical information retrieval. Unlike other approaches or models, which consider aspects at the word level, our approach assumes that aspects should be identified by the topics of the retrieved documents. We use an LDA model to discover the topic distribution of the retrieved passages and the word distribution of each topic dimension, and then re-rank the retrieval results according to the topic-distribution similarity between passages within an N-size sliding window. Experiments on the TREC 2007 Genomics collection with two distinct IR baseline runs demonstrate the effectiveness of our method in promoting ranking diversity for biomedical information retrieval. Evaluation results show that our approach achieves an 8% improvement over the highest Aspect MAP reported in the TREC 2007 Genomics track.
Keywords :
information retrieval; medical computing; statistics; IR baseline; N-size slide window; LDA model; TREC 2007 Genomics collection; biomedical information retrieval; latent Dirichlet allocation; ranking diversity promotion; retrieval passage topic distribution; topic dimension word distribution; topic generative model; word level; Bioinformatics; Biological system modeling; Educational institutions; Genomics; Information retrieval; LDA; biomedical IR; ranking diversity
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1799-4
DOI :
10.1109/BIBM.2011.28