DocumentCode
3744846
Title
Incorporating paragraph embeddings and density peaks clustering for spoken document summarization
Author
Kuan-Yu Chen;Kai-Wun Shih;Shih-Hung Liu;Berlin Chen;Hsin-Min Wang
Author_Institution
Institute of Information Science, Academia Sinica, Taiwan
fYear
2015
Firstpage
207
Lastpage
214
Abstract
Representation learning has emerged as an active research subject in many machine learning applications because of its excellent performance. As one instantiation, word embedding has been widely used in natural language processing. However, as far as we are aware, relatively few studies have investigated paragraph embedding methods in extractive text or speech summarization. Extractive summarization aims at selecting a set of indicative sentences from a source document to express the most important theme of the document. There is a general consensus that relevance and redundancy are both critical issues for users in a realistic summarization scenario. However, most existing methods focus only on determining the degree of relevance between sentences and a given document, while the degree of redundancy is handled in a post-processing step. Based on these observations, this paper makes three contributions. First, we comprehensively compare word and paragraph embedding methods for spoken document summarization. Second, we propose a novel summarization framework that takes both relevance and redundancy information into account simultaneously, so that a set of representative sentences can be selected automatically in a one-pass process. Third, we plug paragraph embedding methods into the proposed framework to further enhance summarization performance. Experimental results demonstrate the effectiveness of our proposed methods compared to existing state-of-the-art methods.
Keywords
"Redundancy","Training","Context modeling","Predictive models","Context","Artificial neural networks"
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
Type
conf
DOI
10.1109/ASRU.2015.7404796
Filename
7404796