Title :
Improved lexicon modeling for continuous speech recognition
Author :
Yun, Seong Jin ; Oh, Yung Hwan ; Shin, Gyung Chul
Author_Institution :
Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Taejon, South Korea
Abstract :
We propose a stochastic lexicon model that represents pronunciation variations so that the lexicon works optimally with a continuous speech recognizer. In this lexicon model, the baseform of each word is represented as a hidden Markov model, with subword states and a probability distribution over subwords. The proposed approach can also be applied to a system employing non-linguistic recognition units, and the lexicon is trained automatically from training utterances. In speaker-independent speech recognition tests using a 3000-word continuous speech database, the proposed system improves the word accuracy by about 27.8% and the sentence accuracy by about 22.4%.
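The idea of a word baseform modeled as a hidden Markov model over subword units can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model topology or the training procedure, so the two-state left-to-right layout, the unit labels `u1` to `u4`, the probabilities, and the function name `forward_likelihood` are all assumptions chosen for illustration. The sketch scores a recognized subword sequence against one word model with the standard forward algorithm.

```python
def forward_likelihood(init, trans, emit, observed):
    """Return P(observed subword sequence | word model) via the forward algorithm.

    init[s]     : probability of starting in subword state s
    trans[p][s] : probability of moving from state p to state s
    emit[s]     : dict mapping a subword unit label to its probability in state s
    """
    n = len(init)
    # Initialise with the first observed subword unit.
    alpha = [init[s] * emit[s].get(observed[0], 0.0) for s in range(n)]
    # Propagate through the remaining units.
    for unit in observed[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s].get(unit, 0.0)
            for s in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state, left-to-right word model (all values are illustrative).
INIT = [1.0, 0.0]
TRANS = [[0.5, 0.5],
         [0.0, 1.0]]
EMIT = [{"u1": 0.8, "u2": 0.2},   # state 0: usually realised as u1, sometimes u2
        {"u3": 0.9, "u4": 0.1}]   # state 1: usually realised as u3, sometimes u4

likely_pron = forward_likelihood(INIT, TRANS, EMIT, ["u1", "u3"])
variant_pron = forward_likelihood(INIT, TRANS, EMIT, ["u2", "u3"])
```

In this toy model the variant pronunciation `u2 u3` still receives a nonzero score, lower than that of `u1 u3`, which is the sense in which a stochastic lexicon can absorb pronunciation variation instead of listing a single fixed baseform.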
Keywords :
hidden Markov models; probability; speech processing; speech recognition; stochastic processes; continuous speech database; continuous speech recognition; hidden Markov model; lexicon modeling; nonlinguistic recognition units; pronunciation variations; sentence accuracy; speaker independent speech recognition tests; stochastic lexicon model; subword states; subwords probability distribution; training utterances; word accuracy; word baseform; Automatic speech recognition; Computer science; Databases; Electronic mail; Hidden Markov models; Probability distribution; Speech recognition; Stochastic processes; Stochastic systems; System testing
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Print_ISBN :
0-8186-7919-0
DOI :
10.1109/ICASSP.1997.598892