• DocumentCode
    2043626
  • Title
    Feature extraction using discrete wavelet transform for speech recognition
  • Author
    Tufekci, Z.; Gowdy, J.N.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    116
  • Lastpage
    123
  • Abstract
    We propose a new feature vector consisting of mel-frequency discrete wavelet coefficients (MFDWC). The MFDWC are obtained by applying the discrete wavelet transform (DWT) to the mel-scaled log filterbank energies of a speech frame. The purpose of using the DWT is to benefit from its localization property in the time and frequency domains. MFDWC are similar to subband-based (SUB) features and multi-resolution (MULT) features in that all three attempt to achieve good time and frequency localization. However, MFDWC have better time/frequency localization than SUB features and MULT features. We evaluated the performance of the new features for clean speech and noisy speech and compared the performance of MFDWC with mel-frequency cepstral coefficients (MFCC), SUB features, and MULT features. Experimental results on a phoneme recognition task showed that an MFDWC-based recognizer gave better results than recognizers based on MFCC, SUB features, and MULT features for the white Gaussian noise, band-limited white Gaussian noise, and clean speech cases.
  • Keywords
    AWGN; bandlimited signals; cepstral analysis; channel bank filters; discrete wavelet transforms; feature extraction; filtering theory; frequency-domain analysis; signal resolution; speech recognition; time-domain analysis; DWT; MFDWC; bandlimited white Gaussian noise; clean speech; discrete wavelet transform; experimental results; feature extraction; frequency domain; frequency localization; localization property; mel-frequency cepstral coefficients; mel-frequency discrete wavelet coefficients; mel-scaled log filterbank energies; multi-resolution features; noisy speech; performance evaluation; phoneme recognition task; speech frame; speech recognition; subband-based features; time domain; time localization; Cepstral analysis; Discrete wavelet transforms; Feature extraction; Filter bank; Frequency domain analysis; Gaussian noise; Mel frequency cepstral coefficient; Speech analysis; Speech recognition; Wavelet coefficients;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Southeastcon 2000. Proceedings of the IEEE
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    0-7803-6312-4
  • Type
    conf
  • DOI
    10.1109/SECON.2000.845444
  • Filename
    845444
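
The abstract describes MFDWC as the DWT applied to the mel-scaled log filterbank energies of a speech frame. A minimal sketch of that single step is given below. It is an illustration under stated assumptions, not the authors' implementation: the record does not name the wavelet, so the orthonormal Haar wavelet is assumed here; the number of filterbank channels is assumed to be a power of two; and the function names haar_dwt and mfdwc_sketch are hypothetical. Computing the mel filterbank energies themselves is outside the sketch.

```python
import math


def haar_dwt(values):
    """Full-depth orthonormal Haar DWT (assumed wavelet; not named in the record).

    The input length must be a power of two. The output holds the coarsest
    approximation coefficient first, then detail bands from coarse to fine.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    scale = math.sqrt(2.0)
    approx = list(values)
    details = []  # detail bands, collected finest scale first
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / scale for a, b in pairs])
        approx = [(a + b) / scale for a, b in pairs]
    coeffs = approx
    for band in reversed(details):
        coeffs = coeffs + band
    return coeffs


def mfdwc_sketch(filterbank_energies):
    """Hypothetical helper: log of mel filterbank energies, then the DWT.

    filterbank_energies are assumed to be positive mel-scaled energies
    for one speech frame, already computed elsewhere.
    """
    log_energies = [math.log(e) for e in filterbank_energies]
    return haar_dwt(log_energies)
```

Because each Haar coefficient depends only on a contiguous group of filterbank channels, a disturbance confined to a few channels affects only some coefficients, which is one way to read the localization property mentioned in the abstract.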