Pronunciation Modeling for Spontaneous Speech Recognition using Latent Pronunciation Analysis (LPA) and Prior Knowledge

Author

Che-Kuang Lin ; Lin-Shan Lee

Author_Institution

Nat. Taiwan Univ., Taipei, Taiwan

Volume

4

fYear

2007

fDate

15-20 April 2007

Abstract

In this paper, we propose a new framework for pronunciation modeling in which the search algorithm focuses primarily on the clearly pronounced portions of speech while de-emphasizing the observations from the slurred portions. This is based on the prior analysis that pronunciation variation is related to the predictability and importance of the words in the spoken utterances, both of which can be estimated to some extent. We define a set of pronunciation-related features and develop a latent pronunciation analysis (LPA) to estimate the "latent pronunciation states" in the speech. The LPA probabilities, the pronunciation-related features, and a further set of prior knowledge obtained from two distance measures between phonemes are integrated in an SVM classifier to produce a "pronunciation variation indicator" for each frame, on which the Viterbi decoding is then based. Preliminary experiments on Mandarin spontaneous speech gave very encouraging initial results.
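
The idea of de-emphasizing slurred frames during Viterbi decoding can be illustrated with a minimal sketch. The abstract does not say how the pronunciation variation indicator enters the search, so the scheme below is an assumption made only for illustration: each frame carries a weight in [0, 1] that scales its emission log-likelihood, so a weight near 0 lets the transition model dominate on that frame. The function name `weighted_viterbi` and all parameter names are hypothetical and do not come from the paper.

```python
import numpy as np


def weighted_viterbi(log_emit, log_trans, log_init, frame_weight):
    """Viterbi decoding with per-frame scaling of emission scores.

    ASSUMPTION (not from the paper): the per-frame indicator is used as a
    multiplicative weight on the emission log-likelihood.

    log_emit:     (T, S) emission log-likelihoods per frame and state
    log_trans:    (S, S) transition log-probabilities, [i, j] = from i to j
    log_init:     (S,)   initial state log-probabilities
    frame_weight: (T,)   weights in [0, 1]; small values de-emphasize a frame
    Returns the best state sequence as a list of T state indices.
    """
    log_emit = np.asarray(log_emit, dtype=float)
    frame_weight = np.asarray(frame_weight, dtype=float)
    T, S = log_emit.shape

    # Scale each frame's emission scores by that frame's weight.
    scaled = frame_weight[:, None] * log_emit

    delta = log_init + scaled[0]              # best score ending in each state
    backptr = np.zeros((T, S), dtype=int)     # best predecessor per frame/state

    for t in range(1, T):
        # scores[i, j]: best path ending in state i, then moving to state j
        scores = delta[:, None] + log_trans
        backptr[t] = np.argmax(scores, axis=0)
        delta = scores[backptr[t], np.arange(S)] + scaled[t]

    # Trace back from the best final state.
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```

With all weights set to 1 this reduces to standard Viterbi decoding. Setting the weight of a single unreliable frame to 0 removes its acoustic evidence, so the decoded path on that frame is determined by its neighbours and the transition model.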

Keywords

Viterbi decoding; feature extraction; probability; search problems; speech coding; speech recognition; support vector machines; Mandarin spontaneous speech; SVM; latent pronunciation analysis; prior knowledge; pronunciation modeling; pronunciation variation indicator; pronunciation-related features; search algorithm; spoken utterances; Algorithm design and analysis; Decoding; Hidden Markov models; Speech analysis; Speech processing; State estimation; Support vector machine classification; Viterbi algorithm; Distance metrics; Probabilistic Latent Semantic Analysis; Pronunciation variation; spontaneous speech

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on

Conference_Location

Honolulu, HI

ISSN

1520-6149

Print_ISBN

1-4244-0727-3

Type

conf

DOI

10.1109/ICASSP.2007.367002

Filename

4218190