DocumentCode
1749621
Title
Using phase spectrum information for improved speech recognition performance
Author
Schluter, Ralf ; Ney, Hermann
Author_Institution
Lehrstuhl fur Informatik VI, RWTH Aachen, Germany
Volume
1
fYear
2001
fDate
2001
Firstpage
133
Abstract
New acoustic features for continuous speech recognition based on the short-term Fourier phase spectrum are introduced for mono (telephone) recordings. The new phase-based features were combined with standard Mel Frequency Cepstral Coefficients (MFCC), and results were produced with and without using additional linear discriminant analysis (LDA) to choose the most relevant features. Experiments were performed on the SieTill corpus for telephone line recorded German digit strings. Using LDA to combine purely phase-based features with MFCCs, we obtained improvements in word error rate of up to 25% relative to using MFCCs alone with the same overall number of parameters in the system.
Keywords
acoustic signal processing; cepstral analysis; fast Fourier transforms; feature extraction; speech recognition; MFCC; Mel Frequency Cepstral Coefficients; SieTill corpus; acoustic features; acoustic interference; automatic speech recognition systems; continuous speech recognition; fast Fourier transform; linear discriminant analysis; mono recordings; phase based features; phase spectrum information; short-term Fourier phase spectrum; speech recognition performance; telephone line recorded German digit strings; telephone recordings; word error rate; Auditory system; Cepstral analysis; Humans; Linear discriminant analysis; MONOS devices; Mel frequency cepstral coefficient; Signal analysis; Speech analysis; Speech recognition; Telephony;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location
Salt Lake City, UT
ISSN
1520-6149
Print_ISBN
0-7803-7041-4
Type
conf
DOI
10.1109/ICASSP.2001.940785
Filename
940785