DocumentCode :
179879
Title :
Regularized constrained maximum likelihood linear regression for speech recognition
Author :
Ghalehjegh, Sina Hamidi ; Rose, Richard C.
Author_Institution :
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
6319
Lastpage :
6323
Abstract :
The use of a graph embedding framework is investigated as a regularization technique in the expectation-maximization (EM) algorithm applied to automatic speech recognition (ASR). The technique is motivated by the fact that graph embeddings of feature vectors have been shown to provide useful characterizations of the underlying manifolds on which these features lie. Incorporating intrinsic graphs that describe these manifolds in the optimization criteria for the EM algorithm has the effect of constraining the solution space in a way that preserves the local structure of the data. Graph embedding based regularization is applied here to estimating parameters in constrained maximum likelihood linear regression (CMLLR) speaker adaptation in continuous density hidden Markov model (CDHMM) based ASR. CMLLR adaptation has been widely used as a maximum likelihood procedure for reducing mismatch between a given HMM model and utterances from an unknown speaker through a linear feature space transformation. However, there is no guarantee that CMLLR transformations will preserve the relationships of the feature vectors along this manifold. It is argued here that graph embedding based regularization will preserve this structure. The impact of this approach on ASR performance is evaluated for unsupervised speaker adaptation on two large vocabulary speech corpora.
Keywords :
expectation-maximisation algorithm; graph theory; hidden Markov models; optimisation; regression analysis; speech recognition; vectors; ASR performance; CDHMM; CMLLR; EM algorithm; automatic speech recognition; continuous density hidden Markov model; expectation-maximization algorithm; feature vectors; graph embedding framework; intrinsic graphs; linear feature space transformation; optimization criteria; regularized constrained maximum likelihood linear regression; unsupervised speaker adaptation; vocabulary speech corpora; Adaptation models; Hidden Markov models; Manifolds; Speech; Speech recognition; Training; Vectors; Constrained MLLR; Graph embedding; Regularization; Speaker adaptation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6854820
Filename :
6854820