• DocumentCode
    1394125
  • Title

    GA-based noisy speech recognition using two-dimensional cepstrum

  • Author

    Lin, Chin-Teng ; Nein, Hsi-Wen ; Hwu, Jiing-Yuan

  • Author_Institution
    Dept. of Electr. & Control. Eng., Chiao-Tung Univ., Hsinchu, Taiwan
  • Volume
    8
  • Issue
    6
  • fYear
    2000
  • fDate
    11/1/2000
  • Firstpage
    664
  • Lastpage
    675
  • Abstract
    Among the various kinds of speech features, the two-dimensional (2-D) cepstrum (TDC) is special in that it can simultaneously represent several types of information contained in the speech waveform: static and dynamic features, as well as global and fine frequency structures. Analysis results show that the coefficients in the lower-index portion of the TDC matrix appear to be more significant than the others. Hence, an utterance can be represented by a single feature vector formed from a few selected TDC coefficients, rather than by a sequence of feature vectors. This representation has the advantages of simple computation and reduced storage space. However, our experiments show that the selection of TDC coefficients is quite sensitive to background noise. To solve this problem, we propose the GA-based M-TDC (modified TDC) method in this paper to improve the representativeness and robustness of the selected TDC coefficients in noisy environments. The M-TDC differs from the standard TDC in its use of filters to remove the noise components. Furthermore, in the GA-based M-TDC method, we apply genetic algorithms (GAs) to find the robust coefficients in the M-TDC matrix. From experiments with five noise types, we find that the GA-based M-TDC method gives better recognition results than the original TDC approach in noisy environments.
  • Keywords
    acoustic noise; cepstral analysis; genetic algorithms; hidden Markov models; speech recognition; GA-based noisy speech recognition; TDC matrix; background noise; dynamic features; feature vector; filters; fine frequency structures; global and fine frequency structures; noise components; simple computation; speech features; speech waveform; static and dynamic features; storage space; two-dimensional cepstrum; utterance; Background noise; Cepstral analysis; Cepstrum; Filters; Frequency; Genetic algorithms; Noise robustness; Speech recognition; Two dimensional displays; Working environment noise;
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type

    jour

  • DOI
    10.1109/89.876300
  • Filename
    876300