• DocumentCode
    730713
  • Title
    Building context-dependent DNN acoustic models using Kullback-Leibler divergence-based state tying
  • Author
    Gosztolya, Gabor; Grosz, Tamas; Toth, Laszlo; Imseng, David
  • Author_Institution
    MTA-SZTE Research Group on Artificial Intelligence, Szeged, Hungary
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4570
  • Lastpage
    4574
  • Abstract
    Deep neural network (DNN) based speech recognizers have recently replaced Gaussian mixture model (GMM) based systems as the state of the art. HMM/DNN systems have kept many refinements of the HMM/GMM framework, even though some of these may be suboptimal for them. One example is the creation of context-dependent tied states, for which an efficient decision tree state tying method exists. The tied states used to train DNNs are usually obtained with the same tying algorithm, even though it is based on likelihoods of Gaussians. In this paper, we investigate an alternative state clustering method that uses the Kullback-Leibler (KL) divergence of DNN output vectors to build the decision tree. This method has already been applied successfully within the framework of KL-HMM systems, and here we show that it is also beneficial for HMM/DNN hybrids. On a large-vocabulary recognition task, we report a 4% relative word error rate reduction using this state clustering method.
  • Keywords
    Gaussian distribution; acoustic signal processing; decision trees; hidden Markov models; learning (artificial intelligence); pattern clustering; speech recognition; DNN output vectors; KL-HMM systems; Kullback-Leibler divergence-based state tying method; context-dependent DNN acoustic models; decision tree state tying method; deep neural network based speech recognizers; relative word error rate reduction; state clustering method; vocabulary recognition task; Artificial neural networks; Context; Hidden Markov models; Speech; Kullback-Leibler divergence; Speech recognition; deep neural networks; state tying
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178836
  • Filename
    7178836