Title :
Building context-dependent DNN acoustic models using Kullback-Leibler divergence-based state tying
Author :
Gosztolya, Gábor ; Grósz, Tamás ; Tóth, László ; Imseng, David
Author_Institution :
MTA-SZTE Research Group on Artificial Intelligence, Szeged, Hungary
Abstract :
Deep neural network (DNN) based speech recognizers have recently replaced Gaussian mixture model (GMM) based systems as the state of the art. HMM/DNN systems have retained many refinements of the HMM/GMM framework, even though some of these may be suboptimal for DNNs. One such example is the creation of context-dependent tied states, for which an efficient decision tree state tying method exists. The tied states used to train DNNs are usually obtained with this same tying algorithm, even though it is based on the likelihoods of Gaussians. In this paper, we investigate an alternative state clustering method that uses the Kullback-Leibler (KL) divergence of DNN output vectors to build the decision tree. This method has already been applied successfully within the framework of KL-HMM systems, and here we show that it is also beneficial for HMM/DNN hybrids. On a large vocabulary recognition task, we report a 4% relative word error rate reduction with this state clustering method.
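Illustration :
A minimal sketch of the kind of KL-based clustering cost described in the abstract, under the following assumptions: each context-dependent state is summarized by the average DNN output (posterior) vector over its frames, and tying a set of states is scored by the frame-weighted KL divergence of each member's average from the pooled average. The names kl_divergence and kl_tying_cost and the toy three-class posteriors are hypothetical, not the authors' implementation.

    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        # KL(p || q) between two discrete posterior distributions;
        # eps guards against log(0) for near-zero posteriors.
        p = np.clip(p, eps, None)
        q = np.clip(q, eps, None)
        return float(np.sum(p * np.log(p / q)))

    def kl_tying_cost(mean_posteriors, frame_counts):
        # Hypothetical tying cost: frame-weighted KL of each state's
        # average DNN posterior to the pooled average of the cluster.
        counts = np.asarray(frame_counts, dtype=float)
        posts = np.asarray(mean_posteriors, dtype=float)
        pooled = (counts[:, None] * posts).sum(axis=0) / counts.sum()
        return sum(c * kl_divergence(p, pooled)
                   for p, c in zip(posts, counts))

    # Toy example: states with similar DNN output distributions are
    # cheap to tie; a dissimilar one raises the cost.
    a = np.array([0.70, 0.20, 0.10])
    b = np.array([0.65, 0.25, 0.10])
    c = np.array([0.10, 0.10, 0.80])
    print(kl_tying_cost([a, b], [100, 120]))  # small cost -> good tie
    print(kl_tying_cost([a, c], [100, 80]))   # large cost -> bad tie

In a decision-tree setting, a question would be chosen to split a cluster so that the summed cost of the resulting child clusters is minimized; the sketch above only shows the cost of a single (hypothetical) merge.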
Keywords :
Gaussian distribution; acoustic signal processing; decision trees; hidden Markov models; learning (artificial intelligence); pattern clustering; speech recognition; DNN output vectors; KL-HMM systems; Kullback-Leibler divergence-based state tying method; context-dependent DNN acoustic models; decision tree state tying method; deep neural network based speech recognizers; relative word error rate reduction; state clustering method; vocabulary recognition task; Artificial neural networks; Context; Hidden Markov models; Speech; Kullback-Leibler divergence; Speech recognition; deep neural networks; state tying;
Conference_Title :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD, Australia
DOI :
10.1109/ICASSP.2015.7178836