DocumentCode :
1348266
Title :
High resolution speech feature parametrization for monophone-based stressed speech recognition
Author :
Sarikaya, Ruhi ; Hansen, John H L
Author_Institution :
Robust Speech Process. Lab., Colorado Univ., Boulder, CO, USA
Volume :
7
Issue :
7
fYear :
2000
fDate :
7/1/2000
Firstpage :
182
Lastpage :
185
Abstract :
This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis. The two parameter schemes are termed wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC). The performance of these features is compared to that of traditional Mel-frequency cepstral coefficients (MFCC) for stressed speech monophone recognition. The speaking styles considered are neutral, angry, loud, and Lombard effect speech from the SUSAS database. Overall monophone recognition improvements of 20.4% and 17.2% are achieved for loud and angry stressed speech, respectively, with a corresponding increase in the neutral monophone rate of 9.9% over MFCC parameters.
Keywords :
cepstral analysis; feature extraction; speech recognition; time-frequency analysis; wavelet transforms; Lombard effect speech; Mel-frequency cepstral coefficients; SUSAS database; acoustic parameters; angry speaking style; high resolution wavelet analysis; loud speaking style; monophone speech recognition; neutral speaking style; stress impact; stressed speech recognition; subband-based cepstral parameters; wavelet packet parameters; Cepstral analysis; Mel frequency cepstral coefficient; Spatial databases; Speech analysis; Speech recognition; Stress; Testing; Vocabulary; Wavelet analysis; Wavelet packets
fLanguage :
English
Journal_Title :
Signal Processing Letters, IEEE
Publisher :
IEEE
ISSN :
1070-9908
Type :
jour
DOI :
10.1109/97.847363
Filename :
847363