Regional Information Center for Science and Technology - GA-based noisy speech recognition using two-dimensional cepstrum

DocumentCode :

1394125

Title :

GA-based noisy speech recognition using two-dimensional cepstrum

Author :

Lin, Chin-Teng ; Nein, Hsi-Wen ; Hwu, Jiing-Yuan

Author_Institution :

Dept. of Electr. & Control. Eng., Chiao-Tung Univ., Hsinchu, Taiwan

Volume :

Issue :

fYear :

2000

fDate :

11/1/2000

Firstpage :

664

Lastpage :

675

Abstract :

Among the various kinds of speech features, the two-dimensional (2-D) cepstrum (TDC) is special in that it can simultaneously represent several types of information contained in the speech waveform: static and dynamic features, as well as global and fine frequency structures. Analysis results show that the coefficients located in the lower-index portion of the TDC matrix seem to be more significant than the others. Hence, to represent an utterance, only some TDC coefficients need to be selected to form a single feature vector instead of a sequence of feature vectors. This has the advantages of simple computation and less storage space. However, our experiments show that the selection of TDC coefficients is quite sensitive to background noise. To solve this problem, we propose the GA-based M-TDC (modified TDC) method in this paper to improve the representativeness and robustness of the selected TDC coefficients in noisy environments. The M-TDC differs from the standard TDC in its use of filters to remove the noise components. Furthermore, in the GA-based M-TDC method, we apply genetic algorithms (GAs) to find the robust coefficients in the M-TDC matrix. From experiments with five noise types, we find that the GA-based M-TDC method achieves better recognition results than the original TDC approach in noisy environments.
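
Illustrative note (not part of the original record): the abstract describes forming one feature vector per utterance from the low-index coefficients of the TDC matrix. The sketch below shows one common way to do this, assuming the TDC is taken as the 2-D inverse DFT of the frame-by-frequency log-magnitude spectrum. The function names, frame length, hop size, and block size are hypothetical choices for illustration; the paper's filtering step (M-TDC) and GA-based coefficient search are not reproduced here.

```python
import numpy as np


def two_dimensional_cepstrum(signal, frame_len=256, hop=128):
    """Return a TDC-style matrix for one utterance.

    Assumption: TDC = 2-D inverse DFT of the log-magnitude spectrogram.
    Axis 0 indexes temporal frequency, axis 1 indexes quefrency.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    # Slice the utterance into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Log-magnitude spectrum of each frame; the small constant avoids log(0).
    log_spec = np.log(np.abs(np.fft.fft(frames, axis=1)) + 1e-10)
    # Inverse DFT over both the frequency axis and the time axis.
    return np.fft.ifft2(log_spec)


def low_index_features(tdc, n_time=4, n_quef=8):
    """Flatten the magnitudes of the low-index block into one feature vector."""
    return np.abs(tdc[:n_time, :n_quef]).ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    utterance = rng.standard_normal(1024)  # stand-in for a speech signal
    tdc = two_dimensional_cepstrum(utterance)
    features = low_index_features(tdc)
    print(tdc.shape, features.shape)
```

Under these assumptions the whole utterance is reduced to a single fixed-length vector, which is the source of the "simple computation and less storage space" advantage the abstract mentions. A GA-based selection would replace the fixed low-index block with a searched subset of coefficient positions.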

Keywords :

acoustic noise; cepstral analysis; genetic algorithms; hidden Markov models; speech recognition; GA-based noisy speech recognition; TDC matrix; background noise; dynamic features; feature vector; filters; fine frequency structures; global and fine frequency structures; noise components; simple computation; speech features; speech waveform; static and dynamic features; storage space; two-dimensional cepstrum; utterance; Background noise; Cepstral analysis; Cepstrum; Filters; Frequency; Genetic algorithms; Noise robustness; Speech recognition; Two dimensional displays; Working environment noise;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.876300

Filename :

876300

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1394125