• DocumentCode
    245328
  • Title

    Combining N-Gram and Dependency Word Pair for Multi-document Summarization

  • Author

    Yungang Ma ; Ji Wu

  • Author_Institution
    Dept. of Electron. Eng., Tsinghua Univ. Beijing, Beijing, China
  • fYear
    2014
  • fDate
    19-21 Dec. 2014
  • Firstpage
    27
  • Lastpage
    31
  • Abstract
    This paper proposes a method for extractive multi-document summarization based on the combined features of n-gram co-occurrences and dependency word pair co-occurrences. The unigram is the basic text unit; bigrams and skip-bigrams reflect the sequential relationships between words in a sentence; dependency word pairs describe the syntactic relationships between words. The co-occurrences of each feature reflect the common topics of multiple documents from different perspectives. The score of a sentence is the weighted sum of the features it contains. The summary is generated by extracting salient sentences based on the maximum significance score model. This approach obtains higher ROUGE scores than several well-known methods on the TAC summarization dataset.
  • Keywords
    natural language processing; text analysis; ROUGE scores; TAC summarization dataset; dependency word pairs cooccurrences; multidocument summarization; n-grams cooccurrences; skip-bigram; unigram; Educational institutions; Feature extraction; Manuals; Natural language processing; Redundancy; Semantics; Syntactics; co-occurrence; dependency word pair; extractive; multi-document; n-gram; summarization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Science and Engineering (CSE), 2014 IEEE 17th International Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4799-7980-6
  • Type

    conf

  • DOI
    10.1109/CSE.2014.39
  • Filename
    7023550
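
The scoring idea in the abstract (a sentence score as a weighted sum of unigram, bigram, skip-bigram, and dependency word pair features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the use of document frequency as the co-occurrence count, the `max_skip` window, the weights, and the greedy length-budgeted selection, which stands in for the paper's maximum significance score model. Dependency word pairs are passed in precomputed, since producing them requires a parser.

```python
from collections import Counter


def ngram_features(tokens, max_skip=2):
    """Unigram, bigram, and skip-bigram features of one tokenized sentence."""
    feats = {("uni", t) for t in tokens}
    feats.update(("bi", a, b) for a, b in zip(tokens, tokens[1:]))
    for i, a in enumerate(tokens):
        # Ordered pairs with at most `max_skip` words between them (assumed window).
        for b in tokens[i + 1 : i + 2 + max_skip]:
            feats.add(("skip", a, b))
    return feats


def sentence_features(tokens, dep_pairs):
    """All features of a sentence; dep_pairs are (head, dependent) tuples
    assumed to come from an external dependency parser."""
    return ngram_features(tokens) | {("dep", h, d) for h, d in dep_pairs}


def document_frequency(docs):
    """Count, for each feature, the number of documents that contain it.
    Using document frequency as the co-occurrence measure is an assumption."""
    df = Counter()
    for doc in docs:
        doc_feats = set()
        for tokens, dep_pairs in doc:
            doc_feats |= sentence_features(tokens, dep_pairs)
        df.update(doc_feats)
    return df


def sentence_score(tokens, dep_pairs, df, weights):
    """Weighted sum over the features the sentence contains."""
    return sum(weights[f[0]] * df[f] for f in sentence_features(tokens, dep_pairs))


def summarize(docs, weights, max_words):
    """Greedy stand-in for sentence selection: take the highest-scoring
    sentences that still fit in the word budget (no redundancy handling)."""
    df = document_frequency(docs)
    candidates = [
        (sentence_score(tokens, deps, df, weights), tokens)
        for doc in docs
        for tokens, deps in doc
    ]
    candidates.sort(key=lambda c: -c[0])  # stable: ties keep document order
    summary, used = [], 0
    for _, tokens in candidates:
        if used + len(tokens) <= max_words:
            summary.append(tokens)
            used += len(tokens)
    return summary


# Toy input: two documents, each a list of (tokens, dependency pairs).
docs = [
    [
        (["cats", "chase", "mice"], [("chase", "cats"), ("chase", "mice")]),
        (["dogs", "bark"], [("bark", "dogs")]),
    ],
    [
        (["cats", "chase", "birds"], [("chase", "cats"), ("chase", "birds")]),
    ],
]
weights = {"uni": 1.0, "bi": 1.0, "skip": 1.0, "dep": 1.0}  # illustrative weights
print(summarize(docs, weights, max_words=5))
```

Features shared by both toy documents (for example the bigram "cats chase") count twice, so sentences that carry them outrank the sentence about dogs. In practice the weights would be tuned on held-out data, and the selection step would need to penalize redundant sentences.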