Title :
Using corpus and knowledge-based similarity measure in Maximum Marginal Relevance for meeting summarization
Author :
Xie, Shasha ; Liu, Yang
Author_Institution :
Univ. of Texas at Dallas, Richardson, TX
fDate :
March 31 2008-April 4 2008
Abstract :
MMR (maximum marginal relevance) is widely used in summarization for its simplicity and efficacy, and has been shown to achieve performance comparable to other approaches for meeting summarization. How to appropriately represent the similarity of two text segments is crucial in MMR. In this paper, we evaluate different similarity measures in the MMR framework for meeting summarization on the ICSI meeting corpus. We introduce a corpus-based measure to capture similarity at the semantic level, and compare this method with cosine similarity and a centroid score that considers only the salient words in the segments. Our experimental results, evaluated with the ROUGE summarization metrics, show that both the centroid score and the corpus-based similarity measure yield better performance than the commonly used cosine similarity. In addition, adding part-of-speech information to the corpus-based approach helps in the human transcript condition, but not when using ASR output.
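The MMR framework named in the abstract can be illustrated with a minimal sketch of the standard greedy selection rule: at each step, pick the segment that maximizes lam * Sim(segment, document) - (1 - lam) * max Sim(segment, already selected). This is not the authors' implementation. The names `mmr_select`, `cosine`, and `lam`, the whitespace tokenization, and the use of raw term counts are all assumptions made for illustration. Only the cosine baseline is shown; a centroid score or a corpus-based measure would be passed in through the hypothetical `sim` parameter.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(segments, k, lam=0.7, sim=cosine):
    """Greedily pick up to k segment indices by the MMR criterion.

    lam trades off relevance to the whole document against redundancy
    with segments that were already selected (illustrative default).
    """
    # Assumption: lowercase whitespace tokens, raw term counts.
    vectors = [Counter(s.lower().split()) for s in segments]

    # The whole document is represented as the sum of its segment vectors.
    doc = Counter()
    for v in vectors:
        doc.update(v)

    selected = []
    candidates = list(range(len(segments)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = sim(vectors[i], doc)
            redundancy = max(
                (sim(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam=1.0` the redundancy term vanishes and selection reduces to ranking by relevance alone; lower values penalize segments that repeat what has already been selected.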
Keywords :
text analysis; ICSI meeting corpus; cosine similarity; knowledge-based similarity measure; maximum marginal relevance; meeting summarization; text segments; Automatic speech recognition; Data mining; Entropy; Hidden Markov models; Humans; Information analysis; Speech processing; Statistical learning; Support vector machines; Text processing; MMR; centroid score; corpus-based similarity
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518777