• DocumentCode
    3430506
  • Title
    Improving deep neural network acoustic models using unlabeled data
  • Author
    Meng Cai ; Wei-Qiang Zhang ; Jia Liu
  • Author_Institution
    Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
  • fYear
    2013
  • fDate
    6-10 July 2013
  • Firstpage
    137
  • Lastpage
    141
  • Abstract
    The Context-Dependent Deep-Neural-Network HMM, or CD-DNN-HMM, is a powerful acoustic modeling technique. Its training process typically involves unsupervised pre-training and supervised fine-tuning. In this paper, we demonstrate that the performance of DNNs can be improved by utilizing a large amount of unlabeled data in the training procedure. With our method, a CD-DNN-HMM trained using 309 hours of unlabeled data and 24 hours of labeled data achieved a word-error rate of 23.7% on the Hub5'00-SWB phone-call transcription task, compared with a word-error rate of 24.3% obtained by a CD-DNN-HMM trained without unlabeled data. We also applied an a priori probability smoothing algorithm that further reduced the word-error rate to 23.2%. On the RT03S-FSH benchmark corpus, our experimental results show that similar performance gains can be obtained by using unlabeled data.
  • Keywords
    error statistics; hidden Markov models; neural nets; probability; speech recognition; CD-DNN-HMM; RT03S-FSH benchmark corpus; a priori probability smoothing algorithm; acoustic modeling technique; context-dependent deep-neural-network; phone-call transcription task; supervised fine-tuning; unsupervised pretraining; word-error rate; Data models; Hidden Markov models; Neural networks; Speech; Speech recognition; Switches; Training; acoustic modeling; deep neural network; speech recognition; unlabeled data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit & International Conference on
  • Conference_Location
    Beijing
  • Type
    conf
  • DOI
    10.1109/ChinaSIP.2013.6625314
  • Filename
    6625314