DocumentCode :
1394125
Title :
GA-based noisy speech recognition using two-dimensional cepstrum
Author :
Lin, Chin-Teng ; Nein, Hsi-Wen ; Hwu, Jiing-Yuan
Author_Institution :
Dept. of Electr. & Control. Eng., Chiao-Tung Univ., Hsinchu, Taiwan
Volume :
8
Issue :
6
fYear :
2000
fDate :
11/1/2000
Firstpage :
664
Lastpage :
675
Abstract :
Among the various kinds of speech features, the two-dimensional (2-D) cepstrum (TDC) is special in that it can simultaneously represent several types of information contained in the speech waveform: static and dynamic features, as well as global and fine frequency structures. Analysis results show that the coefficients located in the lower-index portion of the TDC matrix seem to be more significant than the others. Hence, to represent an utterance, only some TDC coefficients need to be selected to form a single feature vector instead of a sequence of feature vectors. This has the advantages of simple computation and reduced storage space. However, our experiments show that the selection of TDC coefficients is quite sensitive to background noise. To solve this problem, we propose the GA-based M-TDC (modified TDC) method in this paper to improve the representativeness and robustness of the selected TDC coefficients in noisy environments. The M-TDC differs from the standard TDC in its use of filters to remove the noise components. Furthermore, in the GA-based M-TDC method, we apply genetic algorithms (GAs) to find the robust coefficients in the M-TDC matrix. From experiments with five noise types, we find that the GA-based M-TDC method yields better recognition results than the original TDC approach in noisy environments.
Keywords :
acoustic noise; cepstral analysis; genetic algorithms; hidden Markov models; speech recognition; GA-based noisy speech recognition; TDC matrix; background noise; dynamic features; feature vector; filters; fine frequency structures; global and fine frequency structures; noise components; simple computation; speech features; speech waveform; static and dynamic features; storage space; two-dimensional cepstrum; utterance; Background noise; Cepstral analysis; Cepstrum; Filters; Frequency; Genetic algorithms; Noise robustness; Speech recognition; Two dimensional displays; Working environment noise;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.876300
Filename :
876300