Title :
MSBGA: A Multi-Document Summarization System Based on Genetic Algorithm
Author :
He, Yan-Xiang ; Liu, De-xi ; Ji, Dong-hong ; Yang, Hua ; Teng, Chong
Author_Institution :
Sch. of Comput., Wuhan Univ.
Abstract :
The multi-document summarizer using genetic algorithm-based sentence extraction (MSBGA) treats summarization as an optimization problem in which the optimal summary is chosen from a set of candidate summaries formed by combining sentences from the original articles. To solve this NP-hard optimization problem, MSBGA adopts a genetic algorithm, which searches for the optimal summary from a global perspective. The evaluation function employs four features that reflect the criteria of a good summary: satisfactory length, high coverage, high informativeness, and low redundancy. To improve the accuracy of term frequency, MSBGA employs a novel method, TFS, which takes word sense into account when calculating term frequency. Experiments on DUC04 data show that the strategy is effective: its ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
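The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give the feature formulas, the weights, or the genetic operators, so the bit-mask encoding, the four scoring terms, the tournament selection, and every name below (`tokenize`, `fitness`, `evolve`, `max_words`) are assumptions made for illustration. Plain term frequency stands in for TFS, since the sense-aware counting is not specified here.

```python
import random
from collections import Counter


def tokenize(sentence):
    """Lowercase whitespace tokenizer (a stand-in for real preprocessing)."""
    tokens = (w.lower().strip(".,;:!?") for w in sentence.split())
    return [t for t in tokens if t]


def fitness(mask, sentences, term_freq, max_words):
    """Score one candidate summary; each term is an assumed proxy in [0, 1]."""
    chosen = [tokenize(s) for s, bit in zip(sentences, mask) if bit]
    words = [w for sent in chosen for w in sent]
    if not words:
        return 0.0
    unique = set(words)
    # Length: full credit within the budget, scaled down beyond it.
    length_score = 1.0 if len(words) <= max_words else max_words / len(words)
    # Coverage: share of the collection vocabulary present in the summary.
    coverage = len(unique) / len(term_freq)
    # Informativeness: share of total term frequency the summary accounts for.
    informativeness = sum(term_freq[w] for w in unique) / sum(term_freq.values())
    # Low redundancy: share of summary words that are not repeats.
    non_redundancy = len(unique) / len(words)
    return length_score * (coverage + informativeness + non_redundancy) / 3.0


def evolve(sentences, max_words=20, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Evolve bit masks over sentences; bit i = 1 keeps sentence i."""
    rng = random.Random(seed)
    term_freq = Counter(w for s in sentences for w in tokenize(s))
    n = len(sentences)

    def score(mask):
        return fitness(mask, sentences, term_freq, max_words)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry over the two best masks
        while len(next_gen) < pop_size:
            # Tournament selection of two parents (three contenders each).
            a, b = (max(rng.sample(population, 3), key=score) for _ in range(2))
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    return [s for s, bit in zip(sentences, best) if bit], score(best)
```

Calling `evolve(list_of_sentences, max_words=100)` returns the selected sentences and their fitness. Multiplying by the length term, rather than adding it, is one possible design choice that keeps over-long candidates from winning on coverage alone.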
Keywords :
abstracting; genetic algorithms; text analysis; NP hard problem; genetic algorithm; multidocument summarization system; optimization; sentence extraction; term frequency; word sense; Clustering algorithms; Clustering methods; Coherence; Cybernetics; Data mining; Frequency; Fusion power generation; Helium; Machine learning; Physics computing; Tides; MSBGA; Multi-document summarization; term frequency with sense (TFS)
Conference_Title :
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location :
Dalian, China
Print_ISBN :
1-4244-0061-9
DOI :
10.1109/ICMLC.2006.258921