DocumentCode
2208019
Title
Multi-document Summarization Using Minimum Distortion
Author
Ma, Tengfei ; Wan, Xiaojun
Author_Institution
MOE Key Lab. of Comput. Linguistics, Peking Univ., Beijing, China
fYear
2010
fDate
13-17 Dec. 2010
Firstpage
354
Lastpage
363
Abstract
Document summarization plays an important role in natural language processing and text mining. This paper proposes several novel information-theoretic models for multi-document summarization. The models treat document summarization as a transmission system and assume that the best summary is the one with minimum distortion. With a proper distortion measure and a new representation method, the combination of the last two models (the linear representation model and the facility location model) achieves good experimental results on the DUC2002 and DUC2004 datasets. We also show that the model has high interpretability and extensibility.
Keywords
data mining; knowledge representation; natural language processing; text analysis; document summarization; information-theoretic model; text mining; J-S Divergence; information-theoretic summarization; linear representation; minimum distortion; multi-document summarization
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2010 IEEE 10th International Conference on
Conference_Location
Sydney, NSW
ISSN
1550-4786
Print_ISBN
978-1-4244-9131-5
Electronic_ISBN
1550-4786
Type
conf
DOI
10.1109/ICDM.2010.106
Filename
5693989