• DocumentCode
    2177362
  • Title
    A hierarchical, context-dependent neural network architecture for improved phone recognition
  • Author
    Tóth, László
  • Author_Institution
    Res. Group on Artificial Intell., Univ. of Szeged, Szeged, Hungary
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    5040
  • Lastpage
    5043
  • Abstract
    In this paper, we combine three simple, recently proposed refinements to improve HMM/ANN hybrid models. The first refinement is to apply a hierarchy of two nets, where the second net models the contextual relations of the state posteriors produced by the first net. The second is to train the network on context-dependent units (HMM states) instead of context-independent phones or phone states. Because the second refinement yields a large number of output neurons, combining the two methods directly would be problematic. Hence the third refinement is to shrink the output layer of the first net using the bottleneck technique before applying the second net on top of it. The phone recognition results obtained on the TIMIT database demonstrate that the context-dependent and the two-stage modeling methods each bring marked improvements, and that using them in combination results in a further significant gain in accuracy. With the bottleneck technique, a further improvement can be obtained, especially when the number of context-dependent units is large.
  • Keywords
    hidden Markov models; neural nets; speech recognition; HMM-ANN hybrid model; TIMIT database; bottleneck technique; context-dependent neural network architecture; phone recognition; Artificial neural networks; Decoding; Error analysis; Hidden Markov models; Neurons; Speech recognition; Training; HMM/ANN; MLP; Phone recognition; TIMIT; bottleneck
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947489
  • Filename
    5947489
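
The two-stage architecture described in the Abstract can be illustrated with a minimal forward-pass sketch. This is one plausible reading of the abstract, not the author's implementation: the layer sizes, the context radius, the tanh activations, and all function names are hypothetical, and the weights are random and untrained, so only the data flow and the tensor shapes are meaningful. The first net maps acoustic frames to context-dependent state posteriors through a narrow bottleneck layer; the second net reads a window of neighbouring bottleneck outputs and produces the final state posteriors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to make the shapes visible.
N_FEAT, N_HIDDEN, N_BOTTLENECK, N_CD_STATES, RADIUS = 39, 64, 16, 200, 2


def dense(n_in, n_out):
    """Random (untrained) weights and zero bias, for shape illustration only."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def stack_context(frames, radius):
    """Concatenate each frame with `radius` neighbours on each side (edges repeat)."""
    n = len(frames)
    padded = np.pad(frames, ((radius, radius), (0, 0)), mode="edge")
    return np.concatenate([padded[i:i + n] for i in range(2 * radius + 1)], axis=1)


# First net: features -> hidden -> bottleneck -> context-dependent state outputs.
W1, b1 = dense(N_FEAT, N_HIDDEN)
Wb, bb = dense(N_HIDDEN, N_BOTTLENECK)
Wo, bo = dense(N_BOTTLENECK, N_CD_STATES)

# Second net: window of bottleneck outputs -> hidden -> state outputs.
W2, b2 = dense((2 * RADIUS + 1) * N_BOTTLENECK, N_HIDDEN)
W3, b3 = dense(N_HIDDEN, N_CD_STATES)


def first_net_bottleneck(x):
    """Low-dimensional bottleneck activations passed on to the second net."""
    return np.tanh(np.tanh(x @ W1 + b1) @ Wb + bb)


def first_net_posteriors(x):
    """Full first-net output; assumed to be used as the first net's training target layer."""
    return softmax(first_net_bottleneck(x) @ Wo + bo)


def second_net_posteriors(bottleneck):
    """Second-stage posteriors computed from a context window of bottleneck outputs."""
    h = np.tanh(stack_context(bottleneck, RADIUS) @ W2 + b2)
    return softmax(h @ W3 + b3)


frames = rng.normal(size=(10, N_FEAT))      # 10 synthetic feature frames
bottleneck = first_net_bottleneck(frames)
posteriors = second_net_posteriors(bottleneck)
print(bottleneck.shape, posteriors.shape)
```

The point of the bottleneck in this sketch is the input size of the second net: with a context window of 5 frames it sees 5 × 16 = 80 values per frame, rather than 5 × 200 = 1000 if the full context-dependent output layer were stacked directly.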