Title :
Online ngram-enhanced topic model for academic retrieval
Author :
Wang, Han ; Lang, Bo
Author_Institution :
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
Abstract :
Applying topic models to text mining has achieved great success. However, state-of-the-art topic modeling methods still have room for improvement in the field of academic retrieval. In this paper, we propose an online unified topic model that is ngram-enhanced. Our model discovers topics with unigrams as well as topical bigrams and is updated by an online inference algorithm as new data streams arrive. On this basis, we incorporate our model into the query likelihood model and develop an integrated academic search system. Experimental results on the ACM collection show that our proposed methods outperform existing approaches in document modeling and search accuracy. In particular, we demonstrate the efficiency of our system on the academic retrieval problem.
Keywords :
data mining; inference mechanisms; information retrieval; maximum likelihood estimation; text analysis; academic retrieval; document modeling; document search; integrated academic searching system; ngram enhanced topic model; online inference algorithm; query likelihood model; text mining; topical bigram; unigram; Computational modeling; Data models; Inference algorithms; Mathematical model; Neodymium; Object oriented modeling; Predictive models;
Conference_Title :
Digital Information Management (ICDIM), 2011 Sixth International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4577-1538-9
DOI :
10.1109/ICDIM.2011.6093316