Title :
Unsupervised cross-adaptation using language model and deep learning based acoustic model adaptations
Author :
Takagi, Akira ; Konno, Kazuki ; Kato, Masaharu ; Kosaka, Tetsuo
Author_Institution :
Grad. Sch. of Sci. & Eng., Yamagata Univ., Yonezawa, Japan
Abstract :
It is well known that deep learning significantly improves speech recognition performance. In deep learning-based systems, the deep neural network hidden Markov model (DNN-HMM) is used as the acoustic model (AM). Recently, speaker adaptation techniques for DNN-HMM have also been investigated. The aim of this work is to improve the performance of unsupervised batch adaptation using DNN-HMM. The proposed adaptation method is based on the cross-adaptation approach, in which complementary information derived from several systems is used. Gaussian mixture model HMM (GMM-HMM), DNN-HMM, and language model (LM) adaptation processes are conducted sequentially in the cross-adaptation procedure. The proposed method was evaluated on a Japanese lecture speech recognition task, where it reduced the error rate by 13.5% compared with the baseline DNN-HMM-based large vocabulary continuous speech recognition system.
Keywords :
Gaussian processes; hidden Markov models; learning (artificial intelligence); mixture models; neural nets; speech recognition; Gaussian mixture model; complementary information; deep learning based acoustic model adaptations; deep neural network hidden Markov model; language model; speaker adaptation techniques; speech recognition; unsupervised batch adaptation; unsupervised cross-adaptation; Acoustics; Adaptation models; Hidden Markov models; Neural networks; Speech; Speech recognition; Training
Conference_Title :
Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
Conference_Location :
Siem Reap
DOI :
10.1109/APSIPA.2014.7041581