DocumentCode :
180410
Title :
On modeling context-dependent clustered states: Comparing HMM/GMM, hybrid HMM/ANN and KL-HMM approaches
Author :
Razavi, Mohsen ; Rasipuram, Ramya ; Magimai-Doss, Mathew
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
7659
Lastpage :
7663
Abstract :
Deep architectures have recently been explored in the hybrid hidden Markov model/artificial neural network (HMM/ANN) framework, where the ANN outputs are usually the clustered states of context-dependent phones derived from the best-performing HMM/Gaussian mixture model (GMM) system. We can view a hybrid HMM/ANN system as a special case of the recently proposed Kullback-Leibler divergence based hidden Markov model (KL-HMM) approach. In the KL-HMM approach, a probabilistic relationship between the ANN outputs and the context-dependent HMM states is modeled. In this paper, we show that in the KL-HMM framework the ANN output layer may not require as many clustered states as the best HMM/GMM system. Our experimental results on the German part of the Media-Parl database show that the KL-HMM system achieves better performance than the hybrid HMM/ANN and HMM/GMM systems with far fewer clustered states than the HMM/GMM system requires. The reduction in the number of clustered states has broader implications for model complexity and data sparsity issues.
Keywords :
Gaussian processes; hidden Markov models; mixture models; neural nets; speech recognition; HMM-GMM approach; HMM-Gaussian mixture model system; hybrid HMM-ANN approach; KL-HMM approach; Kullback-Leibler divergence based hidden Markov model approach; Media-Parl database; context-dependent phones; data sparsity; hybrid hidden Markov model-artificial neural network; model complexity; number context-dependent clustered state modeling; Acoustics; Artificial neural networks; Context modeling; Hidden Markov models; Probabilistic logic; Speech; Speech recognition; HMM/GMM; Kullback-Leibler divergence based HMM; context-dependent subword units; hybrid HMM/ANN; non-native speech recognition
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6855090
Filename :
6855090