  • DocumentCode
    3744857
  • Title
    Towards utterance-based neural network adaptation in acoustic modeling
  • Author
    Ivan Himawan;Petr Motlicek;Marc Ferras Font;Srikanth Madikeri
  • Author_Institution
    Idiap Research Institute, Martigny, Switzerland
  • fYear
    2015
  • Firstpage
    289
  • Lastpage
    295
  • Abstract
    Despite the superior classification ability of deep neural networks (DNNs), their performance suffers when there is a mismatch between training and testing conditions. Many speaker adaptation techniques have been proposed for DNN acoustic modeling, but progress on environmental robustness is still limited. Techniques developed for speaker adaptation can also be used to handle the impact of environments at the same time, or the two approaches can be combined. Directly adapting the large number of DNN parameters is challenging when the adaptation set is small. The learning hidden unit contributions (LHUC) technique for unsupervised speaker adaptation of DNNs introduces speaker-dependent parameters into the existing speaker-independent network to improve automatic speech recognition (ASR) performance for the target speaker using small amounts of adaptation data. This paper investigates LHUC for adapting the speech recognizer to target speakers and environments, where the impacts of speaker and noise differences are quantified separately. Our findings show that LHUC is capable of adapting to both speaker and noise conditions at the same time. Compared to the speaker-independent model, relative word error rate (WER) improvements of about 9% to 13% are observed for all test conditions on the AMI meeting corpus.
  • Keywords
    "Adaptation models","Hidden Markov models","Acoustics","Training","Speech","Data models","Signal to noise ratio"
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
  • Type
    conf
  • DOI
    10.1109/ASRU.2015.7404807
  • Filename
    7404807
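
The abstract describes LHUC as adding speaker-dependent parameters to an existing speaker-independent network, but does not give the formulation. The sketch below assumes the commonly used one, in which each hidden unit's output is multiplied by an amplitude 2·sigmoid(r), with one learnable value r per hidden unit and per speaker. The function names, activations, and parameter values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lhuc_scale(hidden, r):
    """Re-scale hidden unit activations by speaker-dependent amplitudes.

    Assumed formulation: amplitude = 2 * sigmoid(r). With r = 0 the
    amplitude is 1, so the speaker-independent network is unchanged,
    and every amplitude stays within (0, 2).
    """
    if len(hidden) != len(r):
        raise ValueError("one LHUC parameter is needed per hidden unit")
    return [2.0 * sigmoid(ri) * hi for hi, ri in zip(hidden, r)]


# Hypothetical activations of one hidden layer for a single frame.
hidden = [0.2, 0.9, 0.5]

# Before adaptation: all r = 0, so the layer output is left as it was.
r_init = [0.0, 0.0, 0.0]
unadapted = lhuc_scale(hidden, r_init)

# After adaptation (made-up values): unit 0 is boosted, unit 1 is damped,
# unit 2 is left alone.
r_adapted = [1.0, -1.0, 0.0]
adapted = lhuc_scale(hidden, r_adapted)
```

Under this assumption only the vector r is updated during adaptation, while the speaker-independent weights stay fixed. That keeps the number of adapted parameters small, which fits the abstract's point about adapting with small amounts of data.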