• DocumentCode
    417306
  • Title
    Minimum Kullback-Leibler distance based multivariate Gaussian feature adaptation for distant-talking speech recognition
  • Author
    Pan, Yue; Waibel, Alex
  • Author_Institution
    Interactive Syst. Labs., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    Multivariate Gaussian-based speech compensation or mapping has been developed to reduce the mismatch between training and deployment conditions for robust speech recognition. The acoustic mapping procedure can be formulated as a feature space adaptation where a noisy input signal is transformed by a multivariate Gaussian network. We propose a novel algorithm to update the network parameters based on minimizing the Kullback-Leibler distance between the core recognizer's acoustic model and transformed features. It is designed to achieve optimal overall system performance rather than MMSE on a specific feature domain. An online stochastic gradient descent learning rule is derived. We evaluate the performance of the new algorithm using a JRTk broadcast news system on a distant-talking speech corpus and compare its performance with that of previous MMSE-based approaches. The experiments show the KL-based approach is more effective for a large vocabulary continuous speech recognition (LVCSR) system.
  • Keywords
    Gaussian processes; acoustic noise; gradient methods; hidden Markov models; learning (artificial intelligence); minimisation; random noise; speech recognition; HMM; Kullback-Leibler distance; LVCSR; MMSE; acoustic mapping; distant-talking speech recognition; feature space adaptation; large vocabulary continuous speech recognition; multivariate Gaussian feature adaptation; multivariate Gaussian network; robust speech recognition; speech compensation; speech mapping; stochastic gradient descent learning rule; Acoustic noise; Broadcasting; Gaussian noise; Robustness; Signal mapping; Speech analysis; Speech recognition; Stochastic processes; System performance; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326164
  • Filename
    1326164