• DocumentCode
    1304684
  • Title
    Robust Multifactor Speech Feature Extraction Based on Gabor Analysis
  • Author
    Wu, Qiang ; Zhang, Liqing ; Shi, Guangchuan
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
  • Volume
    19
  • Issue
    4
  • fYear
    2011
  • fDate
    5/1/2011
  • Firstpage
    927
  • Lastpage
    936
  • Abstract
    The performance of speech recognition systems relies on the consistency and adaptability of the speech features under complex conditions during the training and testing stages. Traditional systems usually perform poorly under adverse noisy conditions and are not applicable to most real-world problems. In this paper, we investigate the speech feature extraction problem in noisy environments and propose a novel approach based on Gabor filtering and tensor factorization. Recent physiological and psychoacoustic experimental results suggest that localized spectro-temporal features are essential for auditory perception. To exploit this property, we represent the speech signal as a general higher-order tensor and employ two-dimensional Gabor functions with different scales and directions to analyze localized patches of the power spectrogram. A nonnegative tensor PCA with sparse constraints is then proposed to learn the projection matrices from multiple interrelated feature subspaces. The objective of the sparse constraints is to preserve the statistical characteristics of clean speech data by finding projection matrices of the speech subspaces, and to reduce the noise components whose distributions differ from those of clean speech. A multifactor analysis method is proposed to extract robust sparse features by processing the data samples in tensor structure. The simulation results indicate that the proposed method improves speech recognition performance, especially in noisy environments, compared with traditional speech feature extraction methods.
  • Keywords
    Gabor filters; feature extraction; hearing; speech recognition; tensors; Gabor filtering; auditory perception; noisy environment; nonnegative tensor PCA; power spectrogram; robust multifactor speech feature extraction; spectro-temporal feature; speech recognition system; speech signal; tensor factorization; two-dimensional Gabor function; Acoustic noise
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2010.2070495
  • Filename
    5557762