Title :
Using phase spectrum information for improved speech recognition performance
Author :
Schluter, Ralf ; Ney, Hermann
Author_Institution :
Lehrstuhl für Informatik VI, RWTH Aachen, Germany
Abstract :
New acoustic features for continuous speech recognition based on the short-term Fourier phase spectrum are introduced for mono (telephone) recordings. The new phase-based features were combined with standard Mel Frequency Cepstral Coefficients (MFCCs), and results were produced with and without additional linear discriminant analysis (LDA) to select the most relevant features. Experiments were performed on the SieTill corpus of German digit strings recorded over telephone lines. Using LDA to combine purely phase-based features with MFCCs, we obtained improvements in word error rate of up to 25% relative to using MFCCs alone, with the same overall number of parameters in the system.
Keywords :
acoustic signal processing; cepstral analysis; fast Fourier transforms; feature extraction; speech recognition; MFCC; Mel Frequency Cepstral Coefficients; SieTill corpus; acoustic features; acoustic interference; automatic speech recognition systems; continuous speech recognition; fast Fourier transform; linear discriminant analysis; mono recordings; phase based features; phase spectrum information; short-term Fourier phase spectrum; speech recognition performance; telephone line recorded German digit strings; telephone recordings; word error rate; Auditory system; Cepstral analysis; Humans; Linear discriminant analysis; MONOS devices; Mel frequency cepstral coefficient; Signal analysis; Speech analysis; Speech recognition; Telephony;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
Print_ISBN :
0-7803-7041-4
DOI :
10.1109/ICASSP.2001.940785