• DocumentCode
    353309
  • Title

    The regularized SNN-TA model for recognition of noisy speech

  • Author

    Trentin, Edmondo ; Matassoni, Marco

  • Author_Institution
    ITC-irst, Trento, Italy
  • Volume
    5
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    97
  • Abstract
    The segmental neural network (SNN) architecture was introduced at BBN by Zavaliagkos et al. (1994) for rescoring the N-best hypotheses yielded by a standard continuous density hidden Markov model (CDHMM) applied to automatic speech recognition. An enhanced connectionist model, called SNN with trainable amplitude of activation functions (SNN-TA), is presented for use instead of the CDHMM to perform the recognition of isolated words. A Viterbi-based segmentation, relying on the level building algorithm, is then introduced; it can be combined with the SNN-TA to obtain a hybrid framework for continuous speech recognition. The present paradigm is applied to the recognition of isolated digits, collected in a real car environment under several noisy conditions (traffic, speed, road conditions, etc.) using a microphone placed far from the talker. We stress the fact that robustness to noise can be increased by improving the generalization capabilities of the speech recognizer. In this perspective, while CDHMMs completely lack a proper regularization theory, a regularized SNN-TA model is discussed, which yields effective generalization and noise tolerance, outperforming the CDHMM on the noisy task under consideration.
  • Keywords
    hidden Markov models; neural nets; noise; speech recognition; CDHMM; HMM; N-best hypothesis; SNN architecture; Viterbi-based segmentation; activation functions; automatic speech recognition; car noise; continuous density hidden Markov model; continuous speech recognition; generalization; isolated word recognition; level building algorithm; noise robustness; noise-tolerance; noisy speech recognition; regularized SNN-TA model; road noise; segmental neural network architecture; Automatic speech recognition; Hidden Markov models; Microphones; Neural networks; Noise robustness; Roads; Speech recognition; Stress; Traffic control; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on
  • Conference_Location
    Como
  • ISSN
    1098-7576
  • Print_ISBN
    0-7695-0619-4
  • Type

    conf

  • DOI
    10.1109/IJCNN.2000.861441
  • Filename
    861441