DocumentCode :
1499381
Title :
Segmental modeling using a continuous mixture of nonparametric models
Author :
Goldberger, Jacob ; Burshtein, David ; Franco, Horacio
Author_Institution :
Tel Aviv Univ., Israel
Volume :
7
Issue :
3
fYear :
1999
fDate :
5/1/1999
Firstpage :
262
Lastpage :
271
Abstract :
A major limitation of hidden Markov model (HMM) based automatic speech recognition is the inherent assumption that successive observations within a state are independent and identically distributed (i.i.d.). The i.i.d. assumption is reasonable for some states (e.g., a state that corresponds to a steady-state vowel). However, most states clearly violate this assumption (e.g., states corresponding to vowel-consonant transitions, diphthongs, etc.) and are in fact characterized by a highly correlated and nonstationary speech signal. Alternative models have previously been proposed that attempt to describe the dynamics of the signal within a phonetic unit. This approach is generally known as segmental modeling, since the speech signal is modeled at the segment level rather than at the frame level (as in an HMM). We propose a family of new segmental models that are composed of two elements. The first element is a nonparametric representation of the mean and variance trajectories, and the second is a parameterized transformation (e.g., a random shift) of the trajectory that is global to the entire segment. The new model is in fact a continuous mixture of segment trajectories. We present recognition results on a large vocabulary task, and compare the model to alternative segment models on a triphone recognition task.
Keywords :
acoustic signal processing; correlation methods; hidden Markov models; nonparametric statistics; signal representation; speech recognition; HMM; automatic speech recognition; continuous mixture; correlated speech signal; diphthongs; hidden Markov model; independent identically distributed observations; mean trajectory; nonparametric models; nonparametric representation; nonstationary speech signal; parameterized transformation; phonetic unit; random shift; recognition results; segmental modeling; signal dynamics; speech signal; steady state vowel; triphone recognition task; variance trajectory; vowel-consonant transition; Automatic speech recognition; Covariance matrix; Hidden Markov models; Jacobian matrices; Loudspeakers; Probability distribution; Speech processing; Speech recognition; Steady-state; Vocabulary;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.759032
Filename :
759032