DocumentCode
2703152
Title
Pronunciation Modeling for Spontaneous Speech Recognition using Latent Pronunciation Analysis (LPA) and Prior Knowledge
Author
Che-Kuang Lin ; Lin-Shan Lee
Author_Institution
Nat. Taiwan Univ., Taipei, Taiwan
Volume
4
fYear
2007
fDate
15-20 April 2007
Abstract
In this paper, we propose a new framework for pronunciation modeling in which the search algorithm focuses primarily on the clearly pronounced portions of speech while de-emphasizing observations from the slurred portions. This is based on the prior analysis that pronunciation variation is related to the predictability and importance of the words in the spoken utterances, which can be estimated to some extent. We define a set of pronunciation-related features and develop a latent pronunciation analysis (LPA) to estimate the "latent pronunciation states" in the speech. The LPA probabilities, the pronunciation-related features, and another set of prior knowledge obtained from two distance measures between phonemes are integrated in an SVM classifier to produce a "pronunciation variation indicator" for each frame, on which Viterbi decoding is then based. Very encouraging initial results were obtained in preliminary experiments on Mandarin spontaneous speech.
Keywords
Viterbi decoding; feature extraction; probability; search problems; speech coding; speech recognition; support vector machines; Mandarin spontaneous speech; SVM; latent pronunciation analysis; prior knowledge; pronunciation modeling; pronunciation variation indicator; pronunciation-related features; search algorithm; spoken utterances; Algorithm design and analysis; Decoding; Hidden Markov models; Speech analysis; Speech processing; State estimation; Support vector machine classification; Viterbi algorithm; Distance metrics; Probabilistic Latent Semantic Analysis; Pronunciation variation; spontaneous speech
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location
Honolulu, HI
ISSN
1520-6149
Print_ISBN
1-4244-0727-3
Type
conf
DOI
10.1109/ICASSP.2007.367002
Filename
4218190