Title :
Word-level rate of speech modeling using rate-specific phones and pronunciations
Author :
Zheng, Jing ; Franco, Horacio ; Weng, Fuliang ; Sankar, Ananth ; Bratt, Harry
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
Abstract :
Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect ASR systems. To cope with these effects, we propose to use rate-specific phone models and pronunciations for ROS modeling at the word level. Words are given three types of pronunciations (fast, slow, and medium), each consisting of the corresponding rate-specific phone models. This approach allows us to model within-sentence rate variation. To better model coarticulation effects, we introduce the concept of zero-length phones, which enables short phones to be skipped without having to change their neighboring phones' contexts. A data-driven approach is used to prune the pronunciation dictionary derived from rules for phone reduction. We tested these approaches on the Hub 4 database and achieved a relative improvement of 2.0% over the baseline (an evaluation-quality version of SRI's DECIPHER continuous speech recognition system) for clean native speech in the 1996 development set.
Keywords :
hidden Markov models; speech recognition; ASR systems; automatic speech recognition; data-driven approach; pronunciation dictionary; pronunciations; rate-specific phones; spectral features; speech modeling; word pronunciations; word-level rate of speech modeling; zero-length phones; Automatic speech recognition; Context modeling; Databases; Dictionaries; Hidden Markov models; Laboratories; Speech analysis; Speech recognition; System testing; Training data
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.862097