• DocumentCode
    3425645
  • Title
    Localized spectro-temporal cepstral analysis of speech
  • Author
    Bouvrie, Jake ; Ezzat, Tony ; Poggio, Tomaso
  • Author_Institution
    Center for Biol. & Comput. Learning, Massachusetts Inst. of Technol., Cambridge, MA
  • fYear
    2008
  • fDate
    March 31 2008-April 4 2008
  • Firstpage
    4733
  • Lastpage
    4736
  • Abstract
    Drawing on recent progress in auditory neuroscience, we present a novel speech feature analysis technique based on localized spectro-temporal cepstral analysis of speech. We proceed by extracting localized 2D patches from the spectrogram and projecting them onto a 2D discrete cosine (2D-DCT) basis. For each time frame, a speech feature vector is then formed by concatenating low-order 2D-DCT coefficients from the set of corresponding patches. We argue that our framework has significant advantages over standard one-dimensional MFCC features. In particular, we find that our features are more robust to noise and better capture the temporal modulations important for recognizing plosive sounds. We evaluate the performance of the proposed features on a TIMIT classification task in clean, pink, and babble noise conditions, and show that our feature analysis outperforms traditional features based on MFCCs.
  • Keywords
    discrete cosine transforms; pattern classification; speech processing; speech recognition; 1D MFCC features; auditory neuroscience; discrete cosine transform; localized 2D patches; spectrotemporal cepstral analysis; speech feature vector; 1/f noise; Acoustic noise; Cepstral analysis; Mel frequency cepstral coefficient; Neuroscience; Noise robustness; Performance analysis; Spectrogram; Speech analysis; Nervous system
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518714
  • Filename
    4518714
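
The feature extraction summarized in the Abstract (localized 2D spectrogram patches, projection onto a 2D-DCT basis, concatenation of low-order coefficients per time frame) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the patch size, the frequency hop, the number of retained coefficients, and the choice to start each patch at the current frame are all assumptions made for illustration; the record does not state the paper's actual settings. The input is assumed to be a spectrogram (for example log-magnitude) stored as a list of frequency rows.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct2_low_order(patch, keep):
    """Return the keep x keep lowest-order 2D-DCT coefficients of a patch.

    `patch` is a list of rows; `keep` must not exceed either patch dimension.
    """
    h, w = len(patch), len(patch[0])
    ch, cw = dct_matrix(h), dct_matrix(w)
    coeffs = []
    for u in range(keep):          # frequency-axis basis index
        for v in range(keep):      # time-axis basis index
            s = 0.0
            for i in range(h):
                for j in range(w):
                    s += ch[u][i] * cw[v][j] * patch[i][j]
            coeffs.append(s)
    return coeffs


def frame_features(spec, t, patch_h=4, patch_w=4, hop_f=2, keep=2):
    """Feature vector for time frame t (hypothetical parameter choices).

    spec[f][t] is the spectrogram value at frequency bin f and frame t.
    Patches of size patch_h x patch_w start at frame t and slide along
    the frequency axis in steps of hop_f; the low-order 2D-DCT
    coefficients of every patch are concatenated.
    """
    n_freq = len(spec)
    features = []
    for f0 in range(0, n_freq - patch_h + 1, hop_f):
        patch = [row[t:t + patch_w] for row in spec[f0:f0 + patch_h]]
        features.extend(dct2_low_order(patch, keep))
    return features


# Usage: an 8-bin, 6-frame constant spectrogram yields 3 patches per frame,
# each contributing keep * keep = 4 coefficients.
spec = [[1.0] * 6 for _ in range(8)]
feats = frame_features(spec, 0)
```

Keeping only the low-order coefficients of each patch retains the smooth spectro-temporal envelope and discards fine detail, which parallels the truncation of cepstral coefficients in one-dimensional MFCC analysis.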