Title :
High resolution speech feature parametrization for monophone-based stressed speech recognition
Author :
Sarikaya, Ruhi ; Hansen, John H L
Author_Institution :
Robust Speech Process. Lab., Colorado Univ., Boulder, CO, USA
fDate :
7/1/2000
Abstract :
This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis. The two parameter schemes are termed wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC). The performance of these features is compared with that of traditional Mel-frequency cepstral coefficients (MFCC) for stressed speech monophone recognition. The speaking styles considered are neutral, angry, loud, and Lombard effect speech from the SUSAS database. Overall monophone recognition improvements of 20.4% and 17.2% are achieved for loud and angry stressed speech, respectively, with a corresponding increase in the neutral monophone rate of 9.9% over MFCC parameters.
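The abstract names subband-based cepstral parameters without giving their construction. As a rough illustration only, the sketch below follows one common reading of such features: decompose a frame with a wavelet packet tree, take the log energy of each subband, and apply a DCT across the subbands. The Haar filter, the full three-level tree, the number of coefficients, and all function names are assumptions made for this example; none of them is taken from the record, and they should not be read as the authors' configuration.

```python
import math


def haar_split(x):
    """One orthonormal Haar analysis step: return (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_subbands(frame, levels):
    """Full wavelet packet tree, returned in increasing-frequency order.

    Assumption: a uniform tree (every node is split at every level).
    """
    if len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be a multiple of 2**levels")
    nodes = [list(frame)]
    for _ in range(levels):
        next_nodes = []
        for idx, node in enumerate(nodes):
            approx, detail = haar_split(node)
            # High-pass filtering plus decimation mirrors the spectrum, so
            # odd-indexed nodes have their children swapped to keep the
            # subbands sorted by frequency.
            if idx % 2 == 0:
                next_nodes.extend([approx, detail])
            else:
                next_nodes.extend([detail, approx])
        nodes = next_nodes
    return nodes


def subband_cepstrum(frame, levels=3, n_coeffs=4, floor=1e-10):
    """DCT-II of the log mean energy of each subband (hypothetical settings)."""
    subbands = wavelet_packet_subbands(frame, levels)
    log_e = [
        math.log(max(sum(c * c for c in sb) / len(sb), floor))
        for sb in subbands
    ]
    n = len(log_e)
    return [
        sum(log_e[k] * math.cos(math.pi * i * (k + 0.5) / n) for k in range(n))
        for i in range(n_coeffs)
    ]
```

A call such as `subband_cepstrum(frame, levels=3, n_coeffs=4)` on a frame whose length is a multiple of 8 returns four coefficients. A practical front end would use longer filters and a perceptually motivated tree rather than the uniform Haar tree assumed here.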
Keywords :
cepstral analysis; feature extraction; speech recognition; time-frequency analysis; wavelet transforms; Lombard effect speech; Mel-frequency cepstral coefficients; SUSAS database; acoustic parameters; angry speaking style; high resolution wavelet analysis; loud speaking style; monophone speech recognition; neutral speaking style; stress impact; stressed speech recognition; subband-based cepstral parameters; wavelet packet parameters; Mel frequency cepstral coefficient; Spatial databases; Speech analysis; Stress; Testing; Vocabulary; Wavelet analysis; Wavelet packets
Journal_Title :
IEEE Signal Processing Letters