DocumentCode
1749660
Title
Duration normalization for improved recognition of spontaneous and read speech via missing feature methods
Author
Nedel, Jon P. ; Stern, Richard M.
Author_Institution
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume
1
fYear
2001
fDate
2001
Firstpage
313
Abstract
Hidden Markov models (HMMs) are known to model the duration of sound units poorly. We present a technique to normalize the duration of each phone to overcome this weakness, with the conjecture that speech with normalized phone durations may be better modeled and discriminated using standard HMM acoustic models. Duration normalization is accomplished by dropping frames if a phone is longer than the desired duration and by adding "missing" frames and reconstructing them if a phone is shorter than the desired duration. If phone segmentations are known a priori, we achieve a 15.8% relative reduction in word error rate (WER) on spontaneous speech and a 10.3% relative reduction in WER on read speech. Preliminary work with automatic phone segmentations derived from the data is also presented.
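The drop-or-insert step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `normalize_phone_duration`, the evenly spaced choice of which frames to keep or mark missing, and the use of linear interpolation to fill the missing frames are all assumptions made for illustration. The paper reconstructs the inserted frames with missing feature methods, which this sketch does not attempt to reproduce.

```python
import numpy as np


def normalize_phone_duration(frames, target_len):
    """Resize one phone segment (n_frames x n_dims) to target_len frames.

    Hypothetical helper: frame selection and the interpolation-based
    reconstruction are stand-ins, not the method used in the paper.
    """
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0]
    if n == 0 or target_len < 1:
        raise ValueError("need at least one input frame and target_len >= 1")
    if n == target_len:
        return frames.copy()
    if n > target_len:
        # Phone is too long: drop frames, keeping target_len evenly spaced ones.
        keep = np.round(np.linspace(0, n - 1, target_len)).astype(int)
        return frames[keep]
    # Phone is too short: spread the observed frames over the target grid,
    # treat the remaining slots as "missing", and fill them in.
    slots = np.round(np.linspace(0, target_len - 1, n)).astype(int)
    grid = np.arange(target_len)
    out = np.empty((target_len, frames.shape[1]))
    for d in range(frames.shape[1]):
        # Assumption: linear interpolation replaces missing-feature reconstruction.
        out[:, d] = np.interp(grid, slots, frames[:, d])
    return out
```

Given a phone segmentation, such a function would be applied to each phone's frames in turn, and the resized segments concatenated before decoding with the HMM acoustic models.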
Keywords
hidden Markov models; signal reconstruction; speech recognition; HMM acoustic models; automatic phone segmentation; missing feature methods; missing frames reconstruction; phone duration normalization; read speech recognition; spontaneous speech recognition; word error rate; Computer science; Convolution; Hidden Markov models; Modems; Natural languages; Reconstruction algorithms; Speech recognition; Stochastic processes; Trajectory
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location
Salt Lake City, UT
ISSN
1520-6149
Print_ISBN
0-7803-7041-4
Type
conf
DOI
10.1109/ICASSP.2001.940830
Filename
940830