DocumentCode :
2894295
Title :
MSBGA: A Multi-Document Summarization System Based on Genetic Algorithm
Author :
He, Yan-Xiang ; Liu, De-xi ; Ji, Dong-hong ; Yang, Hua ; Teng, Chong
Author_Institution :
Sch. of Comput., Wuhan Univ.
fYear :
2006
fDate :
13-16 Aug. 2006
Firstpage :
2659
Lastpage :
2664
Abstract :
The multi-document summarizer using genetic algorithm-based sentence extraction (MSBGA) treats summarization as an optimization problem in which the optimal summary is chosen from a set of candidate summaries formed by combining sentences from the original articles. To solve this NP-hard optimization problem, MSBGA adopts a genetic algorithm, which can select the optimal summary from a global perspective. The evaluation function employs four features that reflect the criteria of a good summary: satisfactory length, high coverage, high informativeness, and low redundancy. To improve the accuracy of term frequency, MSBGA employs a novel method, TFS, which takes word sense into account when calculating term frequency. Experiments on DUC04 data show that our strategy is effective: its ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
Keywords :
abstracting; genetic algorithms; text analysis; NP-hard problem; genetic algorithm; multidocument summarization system; optimization; sentence extraction; term frequency; word sense; Clustering algorithms; Clustering methods; Coherence; Cybernetics; Data mining; Frequency; Fusion power generation; Genetic algorithms; Helium; Machine learning; Physics computing; Tides; MSBGA; Multi-document summarization; term frequency with sense (TFS)
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location :
Dalian, China
Print_ISBN :
1-4244-0061-9
Type :
conf
DOI :
10.1109/ICMLC.2006.258921
Filename :
4028512