• DocumentCode
    1910902
  • Title
    Exploiting Surface, Content and Relevance Features for Learning-Based Extractive Summarization
  • Author
    Wu, Mingli ; Li, Wenjie ; Wei, Furu ; Lu, Qin ; Wong, Kam-Fai
  • Author_Institution
    Chinese Univ. of Hong, Shatin
  • fYear
    2007
  • fDate
    Aug. 30 2007-Sept. 1 2007
  • Firstpage
    234
  • Lastpage
    241
  • Abstract
    Extractive summarization identifies whether each sentence should be selected for inclusion in the summary, so it can be cast as a classification task. In this paper, we explore various features under a learning-based classification framework, including basic surface features, content features that a sentence may represent, and features indicating the relevance among sentences. While surface and content features concern the extrinsic and intrinsic aspects of a sentence itself, relevance features describe the strength of sentence relatedness. Sentences processed by the classifiers are then fed to a re-ranking algorithm, and those with higher priority are included in the summary. Experiments show that the proposed framework and the integrated features achieve competitive results on the DUC 2001 document sets when evaluated by ROUGE. We find that relevance features noticeably improve summarization performance.
  • Keywords
    abstracting; document handling; pattern classification; DUC 2001 document set; learning-based classification framework; learning-based extractive summarization; re-ranking algorithm; Algorithm design and analysis; Costs; Feature extraction; Feeds; Frequency; Statistics; Table lookup; Tellurium; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-1611-0
  • Electronic_ISBN
    978-1-4244-1611-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2007.4368082
  • Filename
    4368082