• DocumentCode
    3329941
  • Title

    SNR-dependent non-uniform spectral compression for noisy speech recognition

  • Author

    Chu, K.K. ; Leung, S.H.

  • Author_Institution
    Dept. of Electron. Eng., City Univ. of Hong Kong, China
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    It is known that the loudness of a tone perceived by a human listener is spectrally masked by background noise. This masking effect not only shifts the just-audible sound pressure level of the tone, but also produces a masked loudness function with a steeper slope than the unmasked one. This masking property of perceived loudness motivates a new mel-scale-based feature extraction method with non-uniform spectral compression for speech recognition in noisy environments. In this method, the speech power spectrum undergoes mel-scaled band-pass filtering, as in the standard MFCC front-end. However, the output energies of the filters are compressed by different root values defined by a compression function, which depends on the SNR in each filter band. Using this new scheme of SNR-dependent non-uniform spectral compression (SNSC) for mel-scaled filter-bank-based cepstral coefficients, substantial improvement is found for recognition in different noisy environments, as compared to the standard MFCC and features derived with cubic root spectral compression.
  • Keywords
    band-pass filters; bandwidth compression; cepstral analysis; feature extraction; speech recognition; SNR-dependent nonuniform spectral compression; compression function; filter band; mel-scale-based feature extraction; mel-scaled band-pass filtering; mel-scaled filter-bank-based cepstral coefficients; noisy environments; nonuniform spectral compression; perceived loudness; spectral masking; speech power spectrum; speech recognition; Acoustic noise; Background noise; Band pass filters; Cepstral analysis; Feature extraction; Filtering; Humans; Mel frequency cepstral coefficient; Speech recognition; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326150
  • Filename
    1326150
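
The compression step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual compression function, so `snr_dependent_exponent` below is a hypothetical stand-in: a clamped linear interpolation between two root exponents, with all thresholds and exponent values chosen as placeholders. The direction (a larger exponent at low SNR) is also an assumption, not taken from the paper. The mel filter-bank energies and per-band SNR estimates are assumed to be computed elsewhere.

```python
import numpy as np


def snr_dependent_exponent(snr_db, low_db=0.0, high_db=20.0,
                           exp_low=0.5, exp_high=0.1):
    """Hypothetical compression function (NOT the one from the paper).

    Maps per-band SNR in dB to a root exponent by linear interpolation,
    clamped to [low_db, high_db]. All default values are placeholders.
    """
    snr = np.asarray(snr_db, dtype=float)
    t = np.clip((snr - low_db) / (high_db - low_db), 0.0, 1.0)
    return exp_low + t * (exp_high - exp_low)


def compress_and_cepstrum(band_energies, snr_db, n_ceps=13):
    """Apply a per-band root compression, then a DCT-II to get cepstra.

    band_energies: mel filter-bank output energies, shape (..., n_bands)
    snr_db:        per-band SNR estimates in dB, same trailing shape
    """
    energies = np.asarray(band_energies, dtype=float)
    gamma = snr_dependent_exponent(snr_db)

    # Non-uniform compression: each band gets its own exponent.
    # A standard MFCC front-end would use a logarithm here, and cubic
    # root compression would use gamma = 1/3 for every band.
    compressed = np.power(energies, gamma)

    # Unnormalised DCT-II basis, shape (n_ceps, n_bands).
    n_bands = energies.shape[-1]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bands)[None, :]
    dct_basis = np.cos(np.pi * k * (n + 0.5) / n_bands)

    return compressed @ dct_basis.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    energies = rng.uniform(0.1, 10.0, size=20)  # synthetic band energies
    snr = rng.uniform(-5.0, 25.0, size=20)      # synthetic per-band SNR
    print(compress_and_cepstrum(energies, snr).shape)
```

Setting `exp_low` and `exp_high` to the same value reduces this sketch to uniform root compression, which makes it easy to compare against a fixed cubic root baseline.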