• DocumentCode
    134271
  • Title
    Speech emotion classification using acoustic features
  • Author
    Shizhe Chen; Qin Jin; Xirong Li; Gang Yang; Jieping Xu
  • Author_Institution
    Multimedia Comput. Lab., Renmin Univ. of China, Beijing, China
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    579
  • Lastpage
    583
  • Abstract
    Emotion recognition from speech is a challenging research area with wide applications. In this paper we explore one of the key aspects of building an emotion recognition system: generating a suitable feature representation. We extract features from four angles: (1) low-level acoustic features such as intensity, F0, jitter, shimmer and spectral contours, together with statistical functions over these features; (2) a set of features derived from segmental cepstral-based features scored against emotion-dependent Gaussian mixture models; (3) a set of features derived from a set of low-level acoustic codewords; and (4) GMM supervectors constructed by stacking the means, covariances or weights of the adapted mixture components for each utterance. We apply these features to emotion recognition independently and jointly and compare their performance on this task. We build a support vector machine (SVM) classifier on these features using the IEMOCAP database. Our system achieves a four-class emotion recognition accuracy of 71.9%, outperforming the previously reported best results on this dataset.
  • Keywords
    Gaussian processes; acoustic signal processing; cepstral analysis; emotion recognition; feature extraction; mixture models; signal classification; speech recognition; support vector machines; GMM supervectors; IEMOCAP database; SVM classifier; emotion recognition system; emotion-dependent Gaussian mixture models; feature representation; four-class emotion recognition accuracy; low-level acoustic codewords; low-level acoustic features; segmental cepstral-based features; speech emotion classification; statistical functions; support vector machine; Accuracy; Acoustics; Speech; Acoustic features
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP)
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2014.6936664
  • Filename
    6936664
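
The first feature type named in the abstract, statistical functions computed over low-level acoustic contours, can be illustrated with a minimal sketch. This record does not say which functionals or which extraction toolkit the authors used, so the functional set (mean, standard deviation, minimum, maximum, range), the helper names `functionals` and `utterance_vector`, and the frame values below are all assumptions made for illustration only.

```python
import statistics

# Hypothetical functional set; the record does not list the one used in the paper.
FUNCTIONAL_NAMES = ("mean", "std", "min", "max", "range")


def functionals(contour):
    """Summarize one frame-level contour (e.g. F0 per frame) with utterance-level statistics."""
    lowest, highest = min(contour), max(contour)
    return {
        "mean": statistics.fmean(contour),
        "std": statistics.pstdev(contour),  # population standard deviation
        "min": lowest,
        "max": highest,
        "range": highest - lowest,
    }


def utterance_vector(contours):
    """Concatenate the functionals of every contour into one fixed-length vector.

    Contours are visited in sorted name order so the layout is stable across utterances.
    """
    vector = []
    for name in sorted(contours):
        stats = functionals(contours[name])
        vector.extend(stats[key] for key in FUNCTIONAL_NAMES)
    return vector


# Made-up frame-level values for a single utterance (not data from the paper).
example = {
    "f0": [110.0, 120.0, 130.0, 120.0],
    "intensity": [60.0, 62.0, 64.0],
}
features = utterance_vector(example)
```

Because every utterance maps to a vector of the same length regardless of its duration, vectors of this kind can be passed directly to a fixed-input classifier such as the SVM mentioned in the abstract.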