DocumentCode
3426746
Title
Rhetorical-State Hidden Markov Models for extractive speech summarization
Author
Fung, Pascale ; Chan, Ricky Ho Yin ; Zhang, Justin Jian
Author_Institution
Dept. of Electron. & Comput. Eng., Human Language Technol. Center, Hong Kong Univ. of Sci. & Technol., Hong Kong
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
4957
Lastpage
4960
Abstract
We propose an extractive speech summarization system built on a novel non-generative probabilistic framework. One of the most underutilized features in extractive summarization is rhetorical information, that is, the semantically cohesive units hidden in spoken documents. We propose Rhetorical-State Hidden Markov Models (RSHMMs) to automatically decode this underlying structure in speech. RSHMMs achieve a ROUGE-L F-measure of 71.69% on lecture speech summarization, a 5.69% absolute increase over the baseline system without RSHMMs. They also outperform the baseline system with additional discourse features, showing that RSHMMs are a more refined improvement on the conventional discourse feature.
Keywords
hidden Markov models; speech processing; baseline system; extractive speech summarization; nongenerative probabilistic framework; rhetorical information; rhetorical-state hidden Markov models; Automatic speech recognition; Data mining; Decoding; Feature extraction; Humans; Natural languages; Support vector machine classification; Support vector machines; Text recognition; speech features; spoken document summarization
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4518770
Filename
4518770