• DocumentCode
    188632
  • Title

    Text Summarization Based on Sentence Selection with Semantic Representation

  • Author

    Chi Zhang ; Lei Zhang ; Chong-Jun Wang ; Jun-Yuan Xie

  • Author_Institution
    Nat. Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
  • fYear
    2014
  • fDate
    10-12 Nov. 2014
  • Firstpage
    584
  • Lastpage
    590
  • Abstract
    Text summarization is of great importance in addressing information overload. Salience and coverage are the two most important issues for summaries. Most existing models extract summaries by selecting the top-scoring sentences without using the relationships between sentences, and usually represent sentences based simply on lexical or statistical features. As a result, those models cannot achieve salience or coverage very well. In this paper, we propose a novel summarization model called Sentence Selection with Semantic Representation (SSSR). SSSR ensures both salience and coverage by learning semantic representations for sentences and applying a well-designed selection strategy to select summary sentences. The selection strategy used in SSSR is to select sentences that can reconstruct the original document with the least distortion by means of linear combination. In addition, we improve the selection strategy by reducing redundant information. We then learn two semantic representations for sentences: (1) a weighted mean of word embeddings and (2) deep coding. Both are semantic and compact, and can capture similarities between sentences. Extensive experiments on the DUC2006 and DUC2007 datasets validate our model.
  • Keywords
    statistical analysis; text analysis; DUC2006; DUC2007; SSSR; deep coding; information overload; lexical features; redundant information; sentence selection with semantic representation; statistical features; summary sentences; text summarization; word embeddings; Computational modeling; Encoding; Feature extraction; Hidden Markov models; Redundancy; Semantics; Vectors; document summarization; lasso; selection strategy; sentence representation; word2vec;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
  • Conference_Location
    Limassol
  • ISSN
    1082-3409
  • Type

    conf

  • DOI
    10.1109/ICTAI.2014.93
  • Filename
    6984529