• DocumentCode
    730708
  • Title

    Context adaptive deep neural networks for fast acoustic model adaptation

  • Author

    Delcroix, Marc ; Kinoshita, Keisuke ; Hori, Takaaki ; Nakatani, Tomohiro

  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4535
  • Lastpage
    4539
  • Abstract
    Deep neural networks (DNNs) are widely used for acoustic modeling in automatic speech recognition (ASR), since they greatly outperform legacy Gaussian mixture model-based systems. However, the levels of performance achieved by current DNN-based systems remain far too low in many tasks, e.g. when the training and testing acoustic contexts differ due to ambient noise, reverberation or speaker variability. Consequently, research on DNN adaptation has recently attracted much interest. In this paper, we present a novel approach for the fast adaptation of a DNN-based acoustic model to the acoustic context. We introduce a context adaptive DNN with one or several layers depending on external factors that represent the acoustic conditions. This is realized by introducing a factorized layer that uses a different set of parameters to process each class of factors. The output of the factorized layer is then obtained by weighted averaging over the contribution of the different factor classes, given posteriors over the factor classes. This paper introduces the concept of context adaptive DNN and describes preliminary experiments with the TIMIT phoneme recognition task showing consistent improvement with the proposed approach.
  • Keywords
    acoustic signal processing; neural nets; speech recognition; ASR; DNN-based acoustic model; TIMIT phoneme recognition task; acoustic modeling; automatic speech recognition; context adaptive deep neural networks; external factors; fast acoustic model adaptation; weighted averaging; Acoustics; Adaptation models; Context; Neural networks; Training; Training data; Tuning; Acoustic model adaptation; Automatic speech recognition; Context adaptive DNN; Deep neural networks; Factorized DNN;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type

    conf

  • DOI
    10.1109/ICASSP.2015.7178829
  • Filename
    7178829
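
The factorized layer described in the abstract can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `affine` and `factorized_layer` are hypothetical, each factor class is assumed to own one weight matrix and one bias vector, and the class posteriors are assumed to be supplied by an external estimator. The abstract does not say where the nonlinearity is applied, so the sketch stops at the weighted average of the per-class affine outputs.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def affine(weight: Matrix, bias: Vector, x: Sequence[float]) -> Vector:
    """Compute weight @ x + bias for a single factor class."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]


def factorized_layer(
    x: Sequence[float],
    weights: Sequence[Matrix],
    biases: Sequence[Vector],
    posteriors: Sequence[float],
) -> Vector:
    """Average the per-class affine outputs, weighted by class posteriors.

    Assumption: weights[k] and biases[k] are the parameters used for
    factor class k, and posteriors[k] is the posterior probability of
    that class for the current input.
    """
    if not (len(weights) == len(biases) == len(posteriors)):
        raise ValueError("need one weight matrix, bias and posterior per class")
    if abs(sum(posteriors) - 1.0) > 1e-6:
        raise ValueError("posteriors must sum to 1")

    output = [0.0] * len(biases[0])
    for weight, bias, posterior in zip(weights, biases, posteriors):
        contribution = affine(weight, bias, x)
        # Accumulate this class's contribution, scaled by its posterior.
        output = [o + posterior * c for o, c in zip(output, contribution)]
    return output


if __name__ == "__main__":
    x = [1.0, 2.0]
    weights = [
        [[1.0, 0.0], [0.0, 1.0]],  # class 0 parameters (illustrative values)
        [[2.0, 0.0], [0.0, 2.0]],  # class 1 parameters (illustrative values)
    ]
    biases = [[0.0, 0.0], [1.0, 1.0]]
    print(factorized_layer(x, weights, biases, [0.25, 0.75]))
```

With a one-hot posterior the layer reduces to the ordinary affine layer of the selected class. Softer posteriors interpolate between the class-specific parameter sets, which is what would let the model adapt to the acoustic context without retraining.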