DocumentCode
402858
Title
Sentences clustering based automatic summarization
Author
Wang, Jian-hui ; Zhou, Shui-geng ; Hu, Yun-fa
Author_Institution
Dept. of Comput. & Information Technol., Fudan Univ., Shanghai, China
Volume
1
fYear
2003
fDate
2-5 Nov. 2003
Firstpage
57
Abstract
Research on automatic summarization follows two main approaches: one based on statistics and the other based on message understanding. The former is domain-independent but less accurate; the latter is more accurate but depends on the domain. This paper presents an algorithm that summarizes a document by extracting subtopics from its sentences. It combines statistics with partial message understanding in order to produce better summaries while removing the dependence on domain. Because it is difficult to determine the length of a summary manually, the algorithm also aims to produce a summary of appropriate length. To this end, a new module of mutual dependence is proposed and applied to segmentation, so that accurate features can be selected for the summarizing algorithm. New rules are then introduced to evaluate sentences for the summarizing algorithm. Finally, a new task-based algorithm for objectively evaluating summarization is proposed.
Keywords
pattern clustering; statistics; text analysis; automatic summarization; message understanding; mutual dependence; segmentation; sentences clustering; Clustering algorithms; Concrete; Dictionaries; Information technology; Statistics
fLanguage
English
Publisher
ieee
Conference_Titel
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN
0-7803-8131-9
Type
conf
DOI
10.1109/ICMLC.2003.1264442
Filename
1264442