DocumentCode
245328
Title
Combining N-Gram and Dependency Word Pair for Multi-document Summarization
Author
Yungang Ma ; Ji Wu
Author_Institution
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
fYear
2014
fDate
19-21 Dec. 2014
Firstpage
27
Lastpage
31
Abstract
This paper proposes a method for extractive multi-document summarization based on the combined features of n-gram co-occurrences and dependency word pair co-occurrences. The unigram is the basic text unit; bigrams and skip-bigrams reflect the sequential relationships between words in a sentence; and dependency word pairs describe the syntactic relationships between words. The co-occurrences of each feature reflect the common topics of multiple documents from a different perspective. The score of a sentence is the weighted sum of the features it contains. The summary is generated by extracting salient sentences based on the maximum significance score model. This approach obtains higher ROUGE scores than several well-known methods on the TAC summarization dataset.
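The sentence scoring described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: co-occurrence is taken to be the number of documents containing a feature, only features shared by at least two documents contribute, skip-bigrams are ordered word pairs with 1 to `max_gap` words between them, and the function names, the `weights` values, and the pre-parsed dependency pairs are hypothetical. The paper's sentence selection step (the maximum significance score model) is not reproduced here.

```python
from collections import Counter


def sentence_features(tokens, dep_pairs, max_gap=2):
    """Collect the (kind, value) features of one sentence.

    tokens: list of word strings.
    dep_pairs: iterable of (head, dependent) word pairs, assumed to come
        from an external dependency parser (not shown here).
    max_gap: assumed limit on the words skipped inside a skip-bigram.
    """
    feats = {("uni", t) for t in tokens}
    feats.update(("bi", pair) for pair in zip(tokens, tokens[1:]))
    for i, left in enumerate(tokens):
        # Pair each word with the words 1..max_gap positions past its neighbour.
        for right in tokens[i + 2 : i + 2 + max_gap]:
            feats.add(("skip", (left, right)))
    feats.update(("dep", tuple(pair)) for pair in dep_pairs)
    return feats


def document_frequency(docs):
    """Count, for every feature, how many documents contain it.

    docs: list of documents; each document is a list of
        (tokens, dep_pairs) sentences.
    """
    df = Counter()
    for doc in docs:
        seen = set()
        for tokens, dep_pairs in doc:
            seen |= sentence_features(tokens, dep_pairs)
        df.update(seen)  # each feature counts once per document
    return df


def score_sentence(tokens, dep_pairs, df, weights):
    """Weighted sum over the sentence's features shared by 2+ documents."""
    return sum(
        weights[kind] * df[(kind, value)]
        for kind, value in sentence_features(tokens, dep_pairs)
        if df[(kind, value)] > 1
    )


# Toy input: two one-sentence documents with hand-written dependency pairs.
docs = [
    [(["storm", "hits", "coast"], [("hits", "storm"), ("hits", "coast")])],
    [(["storm", "hits", "city"], [("hits", "storm"), ("hits", "city")])],
]
weights = {"uni": 1.0, "bi": 2.0, "skip": 1.5, "dep": 3.0}  # illustrative values
df = document_frequency(docs)
first_score = score_sentence(*docs[0][0], df, weights)
```

In this toy input the two documents share the unigrams "storm" and "hits", the bigram ("storm", "hits"), and the dependency pair ("hits", "storm"), so only those features contribute to `first_score`. Using document frequency as the co-occurrence measure is one simple choice; the paper may define co-occurrence and the feature weights differently.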
Keywords
natural language processing; text analysis; ROUGE scores; TAC summarization dataset; dependency word pairs cooccurrences; multidocument summarization; n-grams cooccurrences; skip-bigram; unigram; Educational institutions; Feature extraction; Manuals; Natural language processing; Redundancy; Semantics; Syntactics; co-occurrence; dependency word pair; extractive; multi-document; n-gram; summarization;
fLanguage
English
Publisher
ieee
Conference_Titel
Computational Science and Engineering (CSE), 2014 IEEE 17th International Conference on
Conference_Location
Chengdu
Print_ISBN
978-1-4799-7980-6
Type
conf
DOI
10.1109/CSE.2014.39
Filename
7023550