• DocumentCode
    2894295
  • Title
    MSBGA: A Multi-Document Summarization System Based on Genetic Algorithm
  • Author
    He, Yan-Xiang ; Liu, De-xi ; Ji, Dong-hong ; Yang, Hua ; Teng, Chong
  • Author_Institution
    Sch. of Comput., Wuhan Univ.
  • fYear
    2006
  • fDate
    13-16 Aug. 2006
  • Firstpage
    2659
  • Lastpage
    2664
  • Abstract
    The multi-document summarizer using genetic algorithm-based sentence extraction (MSBGA) treats summarization as an optimization problem in which the optimal summary is chosen from a set of candidate summaries formed by combining sentences from the original articles. To solve this NP-hard optimization problem, MSBGA adopts a genetic algorithm, which searches for the optimal summary globally. The evaluation function employs four features that reflect the criteria of a good summary: satisfactory length, high coverage, high informativeness, and low redundancy. To improve the accuracy of term frequency, MSBGA employs a novel method, term frequency with sense (TFS), which takes word sense into account when calculating term frequency. Experiments on DUC04 data show that the strategy is effective: its ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
  • Keywords
    abstracting; genetic algorithms; text analysis; NP hard problem; genetic algorithm; multidocument summarization system; optimization; sentence extraction; term frequency; word sense; Clustering algorithms; Clustering methods; Coherence; Cybernetics; Data mining; Frequency; Fusion power generation; Genetic algorithms; Helium; Machine learning; Physics computing; Tides; MSBGA; Multi-document summarization; genetic algorithm; term frequency with sense (TFS);
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2006 International Conference on
  • Conference_Location
    Dalian, China
  • Print_ISBN
    1-4244-0061-9
  • Type
    conf
  • DOI
    10.1109/ICMLC.2006.258921
  • Filename
    4028512
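
The abstract's framing of summarization as a search over sentence subsets can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `fitness`, `genetic_select`), the way the four criteria are combined, and all parameter values are assumptions made for illustration. Plain term frequency stands in for TFS, since the record does not describe how word senses are resolved.

```python
import random
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and strip basic punctuation (hypothetical helper)."""
    words = (w.strip(".,;:!?").lower() for w in sentence.split())
    return [w for w in words if w]


def fitness(mask, sentences, max_words):
    """Score one candidate summary; the weighting here is an assumption."""
    chosen = [tokenize(s) for s, keep in zip(sentences, mask) if keep]
    if not chosen:
        return 0.0
    all_terms = Counter(w for s in sentences for w in tokenize(s))
    words = [w for sent in chosen for w in sent]
    distinct = set(words)
    # Length: full credit within the budget, scaled down beyond it.
    length_ok = 1.0 if len(words) <= max_words else max_words / len(words)
    # Coverage: share of the collection's vocabulary present in the summary.
    coverage = len(distinct) / len(all_terms)
    # Informativeness: share of total term frequency carried by those terms.
    informativeness = sum(all_terms[w] for w in distinct) / sum(all_terms.values())
    # Redundancy: share of summary words that repeat an earlier word.
    redundancy = 1.0 - len(distinct) / len(words)
    return length_ok * (coverage + informativeness) - redundancy


def genetic_select(sentences, max_words, pop_size=30, generations=60,
                   mutation_rate=0.05, seed=0):
    """Evolve boolean masks over sentences and return the best subset found."""
    rng = random.Random(seed)
    n = len(sentences)

    def score(mask):
        return fitness(mask, sentences, max_words)

    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the current best forward
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            # One-point crossover, then bit-flip mutation.
            cut = rng.randint(1, n - 1) if n > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=score)
    return [s for s, keep in zip(sentences, best) if keep]


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "A cat sat on a mat.",
        "Dogs chase cats in the park.",
        "Stock markets fell sharply today.",
    ]
    for line in genetic_select(docs, max_words=12):
        print(line)
```

Each individual is a boolean mask over the pooled sentences, so the search space has 2^n candidates for n sentences, which is why exhaustive search is impractical and a heuristic such as a genetic algorithm is used instead.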