DocumentCode :
3752213
Title :
Deep neural network-based speech recognition with combination of speaker-class models
Author :
Tetsuo Kosaka;Kazuki Konno;Masaharu Kato
Author_Institution :
Graduate School of Science and Engineering, Yamagata University, Yonezawa, Japan
Year :
2015
Firstpage :
1203
Lastpage :
1206
Abstract :
This paper proposes a new speech recognition method based on speaker-class (SC) models. In previous studies based on this approach, Gaussian-mixture-model-based hidden Markov models (GMM-HMMs) have mainly been used as acoustic models. In this work, SC models that have deep neural network (DNN)-based HMM (DNN-HMM) structures are investigated and used for speaker-independent (SI) speech recognition. To realize SI speech recognition based on SC models, technological challenges must be solved so that unsupervised adaptation can be performed with only one utterance. To address this problem, we propose a new method of combining DNN outputs. In our experiments, five of 963 SC models were selected automatically, and DNN-HMM-based SC models were combined for each utterance. The results showed that the proposed method outperformed a baseline DNN-HMM system.
Keywords :
"Hidden Markov models","Speech recognition","Adaptation models","Silicon","Clustering algorithms","Training","Probability"
Publisher :
IEEE
Conference_Title :
2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)
Type :
conf
DOI :
10.1109/APSIPA.2015.7415464
Filename :
7415464