DocumentCode :
1691617
Title :
Deep neural networks with auxiliary Gaussian mixture models for real-time speech recognition
Author :
Xin Lei; Hui Lin; Georg Heigold
Author_Institution :
Google Inc., Mountain View, CA, USA
fYear :
2013
Firstpage :
7634
Lastpage :
7638
Abstract :
We present a framework that improves real-time speech recognition performance using deep neural networks (DNNs) with auxiliary Gaussian mixture models (GMMs). The DNNs and the auxiliary GMMs share the same hidden Markov model (HMM) state inventory. First, online incremental feature-space adaptation is performed using the GMM acoustic model. The speaker-adapted features are used to improve the recognition performance of both the GMM and the DNN models. Second, the acoustic scores from the GMMs and the DNN are combined at the state level during decoding. Experiments on a large vocabulary speech recognition task show that both approaches improve recognition performance consistently and that the gains are mostly additive, resulting in about 5% relative improvement over the competitive DNN baseline in both Portuguese and English systems.
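The state-level score combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the combination formula, so this assumes a weighted sum in the log domain, with the DNN posterior converted to a scaled likelihood by subtracting the log state prior, as is common in hybrid DNN-HMM systems. The function name `combine_state_scores` and the parameter `gmm_weight` are hypothetical and not taken from the paper.

```python
import math


def combine_state_scores(gmm_loglik, dnn_log_posterior, log_prior, gmm_weight=0.2):
    """Combine per-state acoustic scores for a single frame.

    All three inputs are lists indexed by the shared HMM state id.
    gmm_weight is a hypothetical interpolation weight in [0, 1];
    the value used in the paper is not stated in the abstract.
    """
    if not (len(gmm_loglik) == len(dnn_log_posterior) == len(log_prior)):
        raise ValueError("score vectors must cover the same state inventory")
    if not 0.0 <= gmm_weight <= 1.0:
        raise ValueError("gmm_weight must lie in [0, 1]")

    combined = []
    for g, p, q in zip(gmm_loglik, dnn_log_posterior, log_prior):
        # Assumed convention: DNN scaled log-likelihood = log posterior - log prior.
        dnn_scaled_loglik = p - q
        # Assumed combination rule: linear interpolation in the log domain.
        combined.append(gmm_weight * g + (1.0 - gmm_weight) * dnn_scaled_loglik)
    return combined


# Two states, equal DNN posteriors and priors, equal weighting of both models.
frame_scores = combine_state_scores(
    gmm_loglik=[-2.0, -4.0],
    dnn_log_posterior=[math.log(0.5), math.log(0.5)],
    log_prior=[math.log(0.5), math.log(0.5)],
    gmm_weight=0.5,
)
print(frame_scores)
```

Because both models share one state inventory, the combination reduces to an element-wise operation per frame, which is what makes it cheap enough to apply during decoding.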
Keywords :
Gaussian processes; decoding; hidden Markov models; neural nets; speech coding; speech recognition; DNN; GMM acoustic model; HMM state inventory; auxiliary Gaussian mixture model; decoding; deep neural network; hidden Markov model; large vocabulary speech recognition; online incremental feature-space adaptation; real-time speech recognition; speaker-adapted features; Acoustics; Adaptation models; Hidden Markov models; Speech; Speech recognition; Training; Vectors; DNN; GMM; speaker adaptation; system combination;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639148
Filename :
6639148