• DocumentCode
    3340336
  • Title
    Perceptual harmonic cepstral coefficients for speech recognition in noisy environment
  • Author
    Gu, Liang ; Rose, Kenneth
  • Author_Institution
    Dept. of Electr. & Comput. Eng., California Univ., Santa Barbara, CA, USA
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    125
  • Abstract
    Perceptual harmonic cepstral coefficients (PHCC) are proposed as features to extract from speech for recognition in noisy environments. A weighting function, which depends on the prominence of the harmonic structure, is applied to the power spectrum to ensure accurate representation of the voiced speech spectral envelope. The harmonics-weighted power spectrum undergoes mel-scaled band-pass filtering, and the log-energy of the filters' output is discrete cosine transformed to produce cepstral coefficients. Lower spectral clipping is applied to the power spectrum, followed by within-filter root-power amplitude compression to reduce amplitude variation without compromising the gain-invariance properties. Experiments show significant recognition gains of PHCC over MFCC, with 23% and 36% error rate reduction for the Mandarin digit database in white and babble noise environments, respectively.
  • Keywords
    acoustic noise; band-pass filters; cepstral analysis; data compression; discrete cosine transforms; feature extraction; speech recognition; white noise; Mandarin digit database; PHCC; amplitude variation; babble noise; cepstral coefficients; discrete cosine transform; error rate reduction; gain invariance; harmonic structure; harmonics weighted power spectrum; log-energy; lower spectral clipping; mel-scaled band-pass filtering; noisy environment; perceptual harmonic cepstral coefficients; power spectrum; speech recognition; voiced speech spectral envelope; weighting function; white noise; within-filter root-power amplitude compression; Band pass filters; Cepstral analysis; Error analysis; Feature extraction; Filtering; Mel frequency cepstral coefficient; Power harmonic filters; Power system harmonics; Speech recognition; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2001.940783
  • Filename
    940783
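
The processing chain summarized in the abstract (harmonic weighting, lower spectral clipping, root-power compression, mel-scaled filtering, log-energy, discrete cosine transform) can be roughly sketched as follows. This is an illustrative reading of the abstract only, not the authors' implementation: the function names, the relative clipping floor `clip_ratio`, the exponent `root`, the triangular filter shape, and the filter and coefficient counts are all assumptions, and the harmonic weighting function is not specified in this record, so it is passed in as a hypothetical `harmonic_weights` argument.

```python
import math

def hz_to_mel(f):
    # Common mel-scale approximation (an assumption; the record gives no formula).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular band-pass filters with centers evenly spaced on the mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [nyquist * k / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def phcc_like(power_spectrum, sample_rate, harmonic_weights=None,
              n_filters=20, n_ceps=12, clip_ratio=1e-3, root=1.0 / 3.0):
    """Cepstral coefficients from one frame's power spectrum (sketch only)."""
    n_bins = len(power_spectrum)
    if harmonic_weights is None:
        harmonic_weights = [1.0] * n_bins           # placeholder: no weighting
    weighted = [p * w for p, w in zip(power_spectrum, harmonic_weights)]
    # Lower spectral clipping: floor set relative to the frame peak (assumed form).
    floor = max(clip_ratio * max(weighted), 1e-12)
    clipped = [max(p, floor) for p in weighted]
    # Root-power compression applied per bin before each filter sums its band.
    compressed = [p ** root for p in clipped]
    bank = mel_filterbank(n_filters, n_bins, sample_rate)
    log_e = [math.log(sum(h * c for h, c in zip(row, compressed)) + 1e-12)
             for row in bank]
    # DCT-II of the log-energies, skipping the order-0 term.
    return [sum(e * math.cos(math.pi * n * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for n in range(1, n_ceps + 1)]

# Usage: a synthetic 129-bin power spectrum at an assumed 8 kHz sampling rate.
spectrum = [1.0 + math.cos(0.1 * k) ** 2 for k in range(129)]
coeffs = phcc_like(spectrum, 8000)
louder = phcc_like([100.0 * p for p in spectrum], 8000)
```

In this sketch, because the clipping floor is relative to the frame peak, a uniform gain shifts every log-energy by the same constant, and the DCT terms of order 1 and above cancel a constant offset, so `coeffs` and `louder` agree up to rounding. That is one way compression can coexist with gain invariance; whether it matches the paper's construction is not something this record confirms.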