  • DocumentCode
    2798846
  • Title
    Magnitude spectrum enhancement for robust speech recognition
  • Author
    Tu, Wen-Hsiang; Hung, Jeih-weih
  • Author_Institution
    Dept. of Electr. Eng., Nat. Chi Nan Univ., Taiwan
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4586
  • Lastpage
    4589
  • Abstract
    In this paper, an effective compensation scheme for the spectra of speech signals is proposed in order to improve their noise robustness. In this scheme, named magnitude spectrum enhancement (MSE), voice activity detection (VAD) is first applied to the frame sequence of the utterance; the magnitude spectra of non-speech frames are then set to small values, while those of speech frames are amplified. In experiments conducted on the Aurora-2 noisy digits database, MSE achieves a relative error reduction rate of nearly 50% over the baseline processing, outperforming the well-known spectral-domain speech enhancement techniques of spectral subtraction (SS) and Wiener filtering (WF). In addition, the proposed MSE can be integrated with cepstral-domain robustness methods, such as mean and variance normalization (MVN) and histogram normalization (HEQ), to further improve recognition accuracy in noise-corrupted environments.
  • Keywords
    Wiener filters; spectral analysis; speech enhancement; speech recognition; Aurora-2 noisy digit database; Wiener filtering; cepstral-domain robustness method; frame sequence; histogram normalization; magnitude spectrum enhancement; mean and variance normalization; noise robustness; robust speech recognition; spectral subtraction; spectral-domain speech enhancement; speech signal; voice activity detection; Cepstral analysis; Electronic mail; Histograms; Mel frequency cepstral coefficient; Speech processing; Wiener filter; Working environment noise; robust speech features
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495556
  • Filename
    5495556
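
The processing order described in the abstract (VAD over the frame sequence, then attenuating non-speech magnitude spectra and amplifying speech ones) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the energy-threshold VAD, the function names (`frame_signal`, `energy_vad`, `enhance_magnitude`, `mse_sketch`), the frame sizes, and the gain values are all hypothetical stand-ins for whatever the authors actually used.

```python
import numpy as np


def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def energy_vad(mag, threshold_ratio=0.5):
    """Hypothetical VAD (an assumption, not the paper's detector):
    a frame counts as speech if its spectral energy exceeds a
    fraction of the mean frame energy of the utterance."""
    energy = np.sum(mag ** 2, axis=1)
    return energy > threshold_ratio * energy.mean()


def enhance_magnitude(mag, is_speech, speech_gain=2.0, nonspeech_gain=0.01):
    """Scale each frame's magnitude spectrum: amplify speech frames,
    shrink non-speech frames. Both gain values are illustrative guesses."""
    gains = np.where(is_speech, speech_gain, nonspeech_gain)
    return mag * gains[:, None]


def mse_sketch(x, frame_len=256, hop=128):
    """Frame the signal, take magnitude spectra, run the VAD, apply gains."""
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    mag = np.abs(np.fft.rfft(frames, axis=1))
    is_speech = energy_vad(mag)
    return enhance_magnitude(mag, is_speech), is_speech
```

In a recognition front end, the enhanced magnitude spectra would then feed the usual feature extraction (for example, mel filtering and cepstral analysis); that stage is outside this sketch.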