DocumentCode :
1840727
Title :
Statistical Single-Document Summarization for Chinese News Articles
Author :
Wang, Jenq-Haur ; Yang, Jeng-Yuan
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taipei Univ. of Technol., Taipei, Taiwan
fYear :
2012
fDate :
26-29 March 2012
Firstpage :
183
Lastpage :
188
Abstract :
Given the huge volume of daily news articles, reducing news reading time would help users. In this paper, we focus on single-document summarization of Chinese news articles using statistical methods. First, new vocabulary is collected from news articles and verified with online translation services; the verified terms form an auxiliary lexicon. Then, statistical word segmentation is performed by calculating the relative frequency of overlapping word n-grams. Finally, sentence importance is estimated as the weighted sum of n-gram scores, and the top-ranked sentences are selected as the summary. The experimental results showed that the generated summaries can be effectively clustered into the same group as the original news articles. A large reduction in storage size was observed while preserving suitable similarity with the original document. This shows the potential of the proposed approach for news summarization. Further investigation is needed to verify the approach in other document domains.
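The scoring and selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes character n-grams stand in for word n-grams, skips the auxiliary lexicon and segmentation steps, and uses hypothetical weights (`{2: 1.0, 3: 2.0}`), since the abstract does not state the actual values. The function names `char_ngrams`, `split_sentences`, and `summarize` are also illustrative.

```python
import re
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def split_sentences(document):
    """Split on common sentence terminators (an assumption, not from the paper)."""
    parts = re.split(r"[。！？!?\n]+", document)
    return [p.strip() for p in parts if p.strip()]


def summarize(document, top_k=2, weights=None):
    """Pick the top_k sentences by a weighted sum of n-gram relative frequencies."""
    # Hypothetical weights per n-gram length; the abstract does not give them.
    weights = weights or {2: 1.0, 3: 2.0}
    sentences = split_sentences(document)

    # Relative frequency of each n-gram, computed over the whole document.
    rel_freq = {}
    for n in weights:
        counts = Counter(g for s in sentences for g in char_ngrams(s, n))
        total = sum(counts.values())
        rel_freq[n] = {g: c / total for g, c in counts.items()} if total else {}

    def score(sentence):
        # Weighted sum over n-gram lengths of the summed n-gram scores.
        return sum(
            w * sum(rel_freq[n].get(g, 0.0) for g in char_ngrams(sentence, n))
            for n, w in weights.items()
        )

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:top_k])  # restore original document order
    return [sentences[i] for i in chosen]
```

Because the score is a plain sum, longer sentences tend to rank higher; whether the original method normalizes by sentence length is not stated in the abstract.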
Keywords :
document handling; electronic publishing; language translation; statistical analysis; word processing; Chinese news articles; auxiliary lexicon; group clustering; n-gram score; news summarization; online translation service; sentence importance; statistical method; statistical single-document summarization; statistical word segmentation; storage size reduction; top-ranked sentence; word n-gram overlapping; Feature extraction; Google; Measurement; Meteorology; Moon; Pragmatics; Semantics; news summarization; single-document summarization; text mining; word segmentation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advanced Information Networking and Applications Workshops (WAINA), 2012 26th International Conference on
Conference_Location :
Fukuoka
Print_ISBN :
978-1-4673-0867-0
Type :
conf
DOI :
10.1109/WAINA.2012.132
Filename :
6185120