• DocumentCode
    3268771
  • Title
    Time-frequency feature extraction from spectrograms and wavelet packets with application to automatic stress and emotion classification in speech
  • Author
    He, Ling; Lech, Margaret; Maddage, Namunu C.; Allen, Nicholas B.
  • Author_Institution
    Sch. of Electr. & Comput. Eng., RMIT Univ., Melbourne, VIC, Australia
  • fYear
    2009
  • fDate
    8-10 Dec. 2009
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Three new methods of feature extraction based on time-frequency analysis of speech are presented and compared. In the first approach, speech spectrograms were passed through a bank of 12 log-Gabor filters and the outputs were averaged. In the second approach, the spectrograms were subdivided into ERB frequency bands and the average energy of each band was calculated. In the third approach, wavelet packet arrays were calculated, passed through a bank of 12 log-Gabor filters, and averaged. The feature extraction methods were tested on automatic stress and emotion classification. The feature distributions were modeled and classified using a Gaussian mixture model. The test samples included single vowels, words and sentences from the SUSAS database with 3 classes of stress, and spontaneous speech recordings with 5 emotional classes from the ORI database. Correct classification rates ranged from 64.70% to 84.85% for different SUSAS data sets and from 39.6% to 53.4% for the ORI database.
  • Keywords
    Gabor filters; Gaussian processes; emotion recognition; speech processing; speech recognition; ERB frequency band; Gabor filter; Gaussian mixture model; emotion classification; speech classification; speech spectrogram; stress classification; time-frequency feature extraction; wavelet packet array; Automatic speech recognition; Emotion recognition; Feature extraction; Hidden Markov models; Human factors; Spectrogram; Speech analysis; Stress; Time frequency analysis; Wavelet packets; spectrogram; stress and emotion recognition; time-frequency analysis; wavelet packets
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information, Communications and Signal Processing, 2009. ICICS 2009. 7th International Conference on
  • Conference_Location
    Macau
  • Print_ISBN
    978-1-4244-4656-8
  • Electronic_ISBN
    978-1-4244-4657-5
  • Type
    conf
  • DOI
    10.1109/ICICS.2009.5397513
  • Filename
    5397513
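
The second approach in the abstract (subdividing a spectrogram into ERB frequency bands and averaging the energy in each band) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the number of bands, the frame length, the hop size or the window, so `n_bands`, `frame_len`, `hop` and the Hann window below are assumptions, as are all function names. The ERB-rate mapping used is the common Glasberg and Moore formula, which the record does not confirm as the one used in the paper.

```python
import numpy as np


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg & Moore formula)."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(erb, dtype=float) / 21.4) - 1.0) / 0.00437


def spectrogram(signal, frame_len=256, hop=128):
    """Power spectrogram of shape (frequency bins, frames).

    frame_len, hop and the Hann window are assumed values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2


def erb_band_edges(sample_rate, n_bands):
    """Band edges in Hz, equally spaced on the ERB-rate scale up to Nyquist."""
    top = hz_to_erb_rate(sample_rate / 2.0)
    return erb_rate_to_hz(np.linspace(0.0, top, n_bands + 1))


def erb_band_energies(power_spec, sample_rate, n_bands=12):
    """Average spectrogram energy in each ERB band (n_bands is assumed)."""
    n_bins = power_spec.shape[0]
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges = erb_band_edges(sample_rate, n_bands)
    feats = np.zeros(n_bands)
    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        if b < n_bands - 1:
            mask = (freqs >= lo) & (freqs < hi)
        else:
            mask = (freqs >= lo) & (freqs <= hi)  # include the Nyquist bin
        if mask.any():
            # Mean over all bins in the band and all frames.
            feats[b] = power_spec[mask].mean()
    return feats


if __name__ == "__main__":
    sr = 8000  # assumed sampling rate for the demo signal
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 900.0 * t)
    print(erb_band_energies(spectrogram(tone), sr))
```

The resulting vector has one value per band and could serve as the input to a classifier such as the Gaussian mixture model named in the abstract; the classification stage itself is not sketched here.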