Title :
Mixture of topic model for multi-document summarization
Author :
Liu Na ; Li Ming-Xia ; Lu Ying ; Tang Xiao-jun ; Wang Hai-Wen ; Xiao Peng
Author_Institution :
Sch. of Inf. Sci. & Eng., Dalian Polytech. Univ., Dalian, China
Date :
May 31 - June 2, 2014
Abstract :
Based on the LDA (Latent Dirichlet Allocation) topic model, we propose Titled-LDA, a generative model for multi-document summarization that simultaneously models the content and the titles of documents. The model represents each document as a mixture of topics and extends this approach to title modeling by allowing the mixture weights over topics to be determined by the document titles. In the mixing stage, the algorithm learns the weights adaptively and asymmetrically based on two kinds of information entropy. In this way, the final model incorporates title information and content information appropriately, which improves summarization performance. Experiments show that the proposed algorithm outperforms other state-of-the-art algorithms on the DUC2002 corpus.
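The abstract describes mixing title-derived and content-derived topic distributions with weights learned adaptively from two information entropies. As a minimal sketch of that idea, the snippet below combines two topic distributions using inverse-entropy weighting, so the more peaked (lower-entropy, more informative) distribution gets the larger weight. The specific weighting rule here is an illustrative assumption, not the authors' exact formulation.

```python
import math

def entropy(dist):
    """Shannon entropy of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def mix_topic_distributions(title_dist, content_dist):
    """Combine title and content topic distributions with weights
    derived from their entropies: the lower-entropy distribution
    receives the larger weight. Inverse-entropy weighting is an
    illustrative assumption, not the paper's exact formula."""
    h_t, h_c = entropy(title_dist), entropy(content_dist)
    inv_t, inv_c = 1.0 / (h_t + 1e-12), 1.0 / (h_c + 1e-12)
    w_t = inv_t / (inv_t + inv_c)   # weights are normalized to sum to 1
    w_c = 1.0 - w_t
    return [w_t * pt + w_c * pc for pt, pc in zip(title_dist, content_dist)]

# Example: a peaked title distribution outweighs a flat content one.
title = [0.7, 0.1, 0.1, 0.1]
content = [0.25, 0.25, 0.25, 0.25]
mixed = mix_topic_distributions(title, content)
```

Because both inputs are proper distributions and the weights sum to one, the mixed result is itself a valid topic distribution.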
Keywords :
entropy; learning (artificial intelligence); text analysis; adaptive asymmetric learning; content information; document content; document titles; generative model; information entropies; latent Dirichlet allocation topic model; mixing stage; multidocument summarization; summarization performance; title information; titled-LDA; topic mixture weights; topic model mixture; Adaptation models; Computational linguistics; Computational modeling; Data mining; Information entropy; Mathematical model; Resource management; LDA; multi-document summarization; topic model;
Conference_Titel :
The 26th Chinese Control and Decision Conference (2014 CCDC)
Conference_Location :
Changsha
Print_ISBN :
978-1-4799-3707-3
DOI :
10.1109/CCDC.2014.6853102