Regional Information Center for Science and Technology - Context adaptive deep neural networks for fast acoustic model adaptation

DocumentCode :

730708

Title :

Context adaptive deep neural networks for fast acoustic model adaptation

Author :

Delcroix, Marc ; Kinoshita, Keisuke ; Hori, Takaaki ; Nakatani, Tomohiro

Author_Institution :

NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan

fYear :

2015

fDate :

19-24 April 2015

Firstpage :

4535

Lastpage :

4539

Abstract :

Deep neural networks (DNNs) are widely used for acoustic modeling in automatic speech recognition (ASR), since they greatly outperform legacy Gaussian mixture model-based systems. However, the levels of performance achieved by current DNN-based systems remain far too low in many tasks, e.g. when the training and testing acoustic contexts differ due to ambient noise, reverberation or speaker variability. Consequently, research on DNN adaptation has recently attracted much interest. In this paper, we present a novel approach for the fast adaptation of a DNN-based acoustic model to the acoustic context. We introduce a context adaptive DNN with one or several layers depending on external factors that represent the acoustic conditions. This is realized by introducing a factorized layer that uses a different set of parameters to process each class of factors. The output of the factorized layer is then obtained by weighted averaging over the contribution of the different factor classes, given posteriors over the factor classes. This paper introduces the concept of context adaptive DNN and describes preliminary experiments with the TIMIT phoneme recognition task showing consistent improvement with the proposed approach.
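
The weighted averaging described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `factorized_layer`, the array shapes, the use of an affine transform per factor class, and the random example values are all assumptions made for illustration. The abstract only states that each factor class has its own parameter set and that the layer output is a posterior-weighted average of the class contributions. Any nonlinearity applied afterwards is omitted here.

```python
import numpy as np

def factorized_layer(x, weights, biases, posteriors):
    """Posterior-weighted average of per-class affine transforms (illustrative only).

    x:          (d_in,) input vector
    weights:    (K, d_out, d_in), one hypothetical weight matrix per factor class
    biases:     (K, d_out), one hypothetical bias vector per factor class
    posteriors: (K,) posteriors over the factor classes, assumed to sum to 1
    """
    # Contribution of each factor class k: W_k @ x + b_k, giving shape (K, d_out)
    contributions = np.einsum("koi,i->ko", weights, x) + biases
    # Average the K contributions, weighting each by its class posterior
    return posteriors @ contributions

# Toy example with made-up sizes and random parameters (assumptions, not paper values)
rng = np.random.default_rng(0)
K, d_in, d_out = 3, 4, 2
x = rng.standard_normal(d_in)
W = rng.standard_normal((K, d_out, d_in))
b = rng.standard_normal((K, d_out))
p = np.array([0.7, 0.2, 0.1])  # hypothetical posteriors over three acoustic-context classes
y = factorized_layer(x, W, b, p)
```

With a one-hot posterior the layer reduces to the ordinary affine transform of the selected class, which is one way to see how the external factors steer the layer toward context-specific parameters.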

Keywords :

acoustic signal processing; neural nets; speech recognition; ASR; DNN-based acoustic model; TIMIT phoneme recognition task; acoustic modeling; automatic speech recognition; context adaptive deep neural networks; external factors; fast acoustic model adaptation; weighted averaging; Acoustics; Adaptation models; Context; Neural networks; Training; Training data; Tuning; Acoustic model adaptation; Automatic speech recognition; Context adaptive DNN; Deep neural networks; Factorized DNN;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178829

Filename :

7178829

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=730708