• DocumentCode
    3341119
  • Title
    High-performance robust speech recognition using stereo training data
  • Author
    Deng, Li ; Acero, Alex ; Jiang, Li ; Droppo, Jasha ; Huang, Xuedong
  • Author_Institution
    Microsoft Res., Redmond, WA, USA
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    301
  • Abstract
    We describe a novel technique, SPLICE (Stereo-based Piecewise Linear Compensation for Environments), for high-performance robust speech recognition. It is an efficient noise reduction and channel distortion compensation technique that makes effective use of stereo training data. We present a new version of SPLICE using the minimum-mean-square-error decision, and describe an extension by training clusters of hidden Markov models (HMMs) with SPLICE processing. Comprehensive results using a Wall Street Journal large vocabulary recognition task and a wide range of noise types demonstrate the superior performance of the SPLICE technique over that obtained under noisy matched conditions (19% word error rate reduction). The new technique is also shown to consistently outperform the spectral-subtraction noise reduction technique, and is currently being integrated into the Microsoft MiPad, a new generation PDA prototype.
  • Keywords
    hidden Markov models; least mean squares methods; noise; speech recognition; HMM; MMSE decision; Microsoft MiPad; PDA prototype; SPLICE; Wall Street Journal; channel distortion compensation; hidden Markov models; large vocabulary speech recognition; minimum-mean-square-error decision; noise reduction; noisy matched conditions; robust speech recognition; spectral-subtraction noise reduction; stereo training data; stereo-based piecewise linear compensation; word error rate reduction; Acoustic noise; Cepstral analysis; Hidden Markov models; Noise reduction; Noise robustness; Partitioning algorithms; Speech recognition; Training data; Vocabulary; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2001.940827
  • Filename
    940827