• DocumentCode
    3390165
  • Title

    A novel spectro-temporal feature extraction method for phoneme classification

  • Author

    Fartash, Mehdi ; Setayeshi, Saeed ; Razzazi, Farbod

  • Author_Institution
    Dept. of Comput. Eng., Islamic Azad Univ., Tehran, Iran
  • fYear
    2010
  • fDate
    24-28 Oct. 2010
  • Firstpage
    569
  • Lastpage
    572
  • Abstract
    In this paper, we propose a new feature extraction method inspired by a model of auditory cortical processing. The output of the cortical model is a 4-D spectro-temporal representation of the sound in which each point indicates the amount of energy at the corresponding time, frequency, rate, and scale. In the proposed model, one suitable rate and one suitable scale are selected from the available rates and scales, which reduces the output of the cortical model from a 4-D space to a 2-D space. In most ASR systems, an HMM classifier is used to handle the variable-length problem after a framing procedure; this framing affects the feature extraction stage and degrades the temporal information of the phoneme signal at the feature level. In the proposed model, this problem is handled in the feature extraction stage: fixed-length features are obtained for each phoneme by analyzing the spectro-temporal space. Since the resulting feature vector has a fixed dimension, we use a classical classifier, the support vector machine, for the phoneme classification task. To evaluate the proposed model, we performed phoneme classification on seven subsets of the TIMIT corpus. The results on consonants and vowels showed average performance improvements of 5.15% and 9.65%, respectively, relative to the HMM-MFCC+ΔMFCC approach. In addition, the average improvements are 8.7% and 2.68%, respectively, relative to the SVM-MFCC approach.
  • Keywords
    audio signal processing; feature extraction; signal classification; speech processing; support vector machines; auditory cortical processing; phoneme classification; spectro-temporal feature extraction; support vector machine; Brain modeling; Computational modeling; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Spectrogram; Support vector machines; auditory model; feature extraction; phoneme classification; spectro-temporal analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing (ICSP), 2010 IEEE 10th International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-5897-4
  • Type

    conf

  • DOI
    10.1109/ICOSP.2010.5655038
  • Filename
    5655038