  • DocumentCode
    672375
  • Title
    Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition
  • Author
    Imseng, David ; Motlicek, Petr ; Garner, Philip N. ; Bourlard, Herve
  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    332
  • Lastpage
    337
  • Abstract
    Posterior-based acoustic modeling techniques such as Kullback-Leibler divergence-based HMM (KL-HMM) and Tandem are able to exploit out-of-language data through posterior features estimated by a Multi-Layer Perceptron (MLP). In this paper, we investigate the performance of posterior-based approaches in the context of under-resourced speech recognition when a standard three-layer MLP is replaced by a deeper five-layer MLP. The deeper MLP architecture yields similar gains of about 15% (relative) for Tandem, for KL-HMM, and for a hybrid HMM/MLP system that directly uses the posterior estimates as emission probabilities. The best-performing system, a bilingual KL-HMM based on a deep MLP jointly trained on Afrikaans and Dutch data, performs 13% better than a hybrid system using the same bilingual MLP and 26% better than a subspace Gaussian mixture system trained only on Afrikaans data.
  • Keywords
    Gaussian processes; hidden Markov models; maximum likelihood estimation; mixture models; multilayer perceptrons; probability; speech recognition; Afrikaans data; Dutch data; KL-HMM; Kullback-Leibler divergence based HMM; Tandem; hybrid HMM-MLP system; multilayer perceptron; posterior based acoustic modeling technique; standard three-layer deep MLP architecture; subspace Gaussian mixture system; under-resourced speech recognition; Acoustics; Context modeling; Hidden Markov models; Speech; Speech recognition; Standards; Training; deep MLPs; hybrid system
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type
    conf
  • DOI
    10.1109/ASRU.2013.6707752
  • Filename
    6707752