Title :
Improving deep neural network acoustic models using unlabeled data
Author :
Meng Cai ; Wei-Qiang Zhang ; Jia Liu
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
Abstract :
The Context-Dependent Deep-Neural-Network HMM, or CD-DNN-HMM, is a powerful acoustic modeling technique. Its training process typically involves unsupervised pre-training and supervised fine-tuning. In this paper, we demonstrate that the performance of DNNs can be improved by utilizing a large amount of unlabeled data in the training procedure. Using our method, a CD-DNN-HMM trained on 309 hours of unlabeled data and 24 hours of labeled data achieved a word-error rate of 23.7% on the Hub5'00-SWB phone-call transcription task, compared to a word-error rate of 24.3% for a CD-DNN-HMM trained without unlabeled data. We also applied an a priori probability smoothing algorithm that further reduced the error rate to 23.2%. On the RT03S-FSH benchmark corpus, our experimental results show that similar performance gains can be obtained from unlabeled data.
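For context, hybrid CD-DNN-HMM decoding converts the DNN's senone posteriors into scaled likelihoods by dividing out the class priors. The sketch below shows one common form of prior smoothing, raising the prior to an exponent alpha < 1 before the division; this is a hedged illustration of the general technique, not necessarily the exact smoothing algorithm used in the paper, and the function and parameter names are hypothetical.

```python
import numpy as np

def smoothed_scaled_log_likelihoods(log_posteriors, log_priors, alpha=0.8):
    """Convert DNN senone log-posteriors to scaled log-likelihoods.

    Standard hybrid decoding computes log p(x|s) ∝ log p(s|x) - log p(s).
    Smoothing the prior with an exponent alpha in (0, 1] (a common variant;
    the paper's exact algorithm may differ) dampens the influence of priors,
    which can be poorly estimated when little labeled data is available.
    """
    return log_posteriors - alpha * np.asarray(log_priors)

# With alpha = 1.0 this reduces to the usual posterior/prior division.
post = np.log(np.array([0.7, 0.3]))   # example senone posteriors for one frame
prior = np.log(np.array([0.5, 0.5]))  # example senone priors
scores = smoothed_scaled_log_likelihoods(post, prior, alpha=1.0)
```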
Keywords :
error statistics; hidden Markov models; neural nets; probability; speech recognition; CD-DNN-HMM; RT03S-FSH benchmark corpus; a priori probability smoothing algorithm; acoustic modeling technique; context-dependent deep-neural-network; phone-call transcription task; supervised fine-tuning; unsupervised pretraining; word-error rate; Data models; Hidden Markov models; Neural networks; Speech; Speech recognition; Switches; Training; acoustic modeling; deep neural network; speech recognition; unlabeled data;
Conference_Titel :
Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit & International Conference on
Conference_Location :
Beijing
DOI :
10.1109/ChinaSIP.2013.6625314