DocumentCode
3141523
Title
Multi-document summarization based on hierarchical topic model
Author
Liu, Hongyan ; Li, Lei
Author_Institution
Center for Intell. Sci. & Technol., Beijing Univ. of Posts & Telecommun., Beijing, China
fYear
2011
fDate
27-29 Nov. 2011
Firstpage
88
Lastpage
91
Abstract
In this paper, we introduce an extractive multi-document summarization method based on a hierarchical topic model, hierarchical Latent Dirichlet Allocation (hLDA), combined with sentence compression. hLDA is a representative generative probabilistic model that not only mines latent topics from large amounts of discrete data but also organizes these topics into a hierarchy, enabling deeper semantic analysis. We also use sentence compression to refine the summaries and make them more concise. We use the TAC 2010 data sets as the experimental test corpus and the ROUGE method to evaluate our summaries. The evaluation confirms that our method performs better than some traditional methods.
Keywords
data compression; document handling; statistical analysis; ROUGE method; hierarchical latent Dirichlet allocation model; hierarchical topic model; multidocument summarization; semantic analysis; sentence compression technology; hierarchical Latent Dirichlet Allocation topic model; multi-document summarization; sentence compression
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering (NLP-KE), 2011 7th International Conference on
Conference_Location
Tokushima
Print_ISBN
978-1-61284-729-0
Type
conf
DOI
10.1109/NLPKE.2011.6138174
Filename
6138174