DocumentCode
2894295
Title
MSBGA: A Multi-Document Summarization System Based on Genetic Algorithm
Author
He, Yan-Xiang ; Liu, De-xi ; Ji, Dong-hong ; Yang, Hua ; Teng, Chong
Author_Institution
Sch. of Comput., Wuhan Univ.
fYear
2006
fDate
13-16 Aug. 2006
Firstpage
2659
Lastpage
2664
Abstract
The multi-document summarizer using genetic algorithm-based sentence extraction (MSBGA) treats summarization as an optimization problem in which the optimal summary is chosen from the set of candidate summaries formed by combining sentences from the original articles. To solve this NP-hard optimization problem, MSBGA adopts a genetic algorithm, which searches for the optimal summary globally. The evaluation function employs four features that reflect the criteria of a good summary: satisfactory length, high coverage, high informativeness, and low redundancy. To improve the accuracy of term frequency, MSBGA employs a novel method, TFS (term frequency with sense), which takes word sense into account when calculating term frequency. Experiments on DUC04 data show that our strategy is effective: the ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
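The abstract gives no formulas, so the following is only a rough sketch of how sentence extraction might be framed as a genetic search. It assumes that a candidate summary is a bit mask over sentences and that the four criteria can be combined multiplicatively. The names `fitness` and `evolve`, the scoring terms, and every parameter value are hypothetical. Plain term frequency stands in for TFS, which would need a word-sense resource.

```python
import random
from collections import Counter

def fitness(mask, sentences, tf, max_words):
    """Score a candidate summary (hypothetical formula, not the paper's)."""
    chosen = [s for bit, s in zip(mask, sentences) if bit]
    if not chosen:
        return 0.0
    words = [w for s in chosen for w in s]
    distinct = set(words)
    # Length: full credit within the word budget, scaled down beyond it.
    length_ok = 1.0 if len(words) <= max_words else max_words / len(words)
    # Coverage: share of distinct source terms that appear in the summary.
    coverage = len(distinct) / len(tf)
    # Informativeness: share of term-frequency mass carried by summary terms.
    info = sum(tf[w] for w in distinct) / sum(tf.values())
    # Redundancy: share of summary tokens that repeat an earlier token.
    redundancy = 1.0 - len(distinct) / len(words)
    return length_ok * (coverage + info) * (1.0 - redundancy)

def evolve(sentences, max_words, pop_size=30, generations=60, p_mut=0.05, seed=0):
    """Return indices of the selected sentences (needs at least two sentences)."""
    rng = random.Random(seed)
    tf = Counter(w for s in sentences for w in s)  # plain TF, not TFS
    n = len(sentences)

    def score(mask):
        return fitness(mask, sentences, tf, max_words)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=score)
    return [i for i, bit in enumerate(best) if bit]

# Toy input: each sentence is a list of lowercase tokens.
docs = [
    "the storm hit the coast on monday".split(),
    "the storm hit the coast early monday".split(),
    "thousands of residents lost power".split(),
    "crews restored power to residents by friday".split(),
]
print(evolve(docs, max_words=14))
```

The multiplicative form is one arbitrary choice among many. A weighted sum, or a hard length constraint, would fit the same description equally well.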
Keywords
abstracting; genetic algorithms; text analysis; NP hard problem; genetic algorithm; multidocument summarization system; optimization; sentence extraction; term frequency; word sense; Clustering algorithms; Clustering methods; Coherence; Cybernetics; Data mining; Frequency; Fusion power generation; Genetic algorithms; Helium; Machine learning; Physics computing; Tides; MSBGA; Multi-document summarization; genetic algorithm; term frequency with sense (TFS);
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location
Dalian, China
Print_ISBN
1-4244-0061-9
Type
conf
DOI
10.1109/ICMLC.2006.258921
Filename
4028512