• DocumentCode
    2287008
  • Title
    Speech emotion estimation in 3D space
  • Author
    Wu, Dongrui ; Parsons, Thomas D. ; Mower, Emily ; Narayanan, Shrikanth
  • Author_Institution
    Inst. for Creative Technol., Univ. of Southern California, Marina del Rey, CA, USA
  • fYear
    2010
  • fDate
    19-23 July 2010
  • Firstpage
    737
  • Lastpage
    742
  • Abstract
    Speech processing is an important element of affective computing. Most research in this direction has focused on classifying emotions into a small number of categories. However, numerical representations of emotions in a multi-dimensional space can be more appropriate for reflecting the gradient nature of emotion expressions, and can be more convenient because they deal with only a small set of emotion primitives. This paper presents three approaches (robust regression, support vector regression, and locally linear reconstruction) for estimating emotion primitives in 3D space (valence/activation/dominance), and two approaches (average fusion and locally weighted fusion) for fusing the three elementary estimators to improve overall recognition accuracy. The three elementary estimators are diverse and complementary because they cover both linear and nonlinear models, and both global and local models. These five approaches are compared with the state-of-the-art estimator on the same spontaneously elicited emotion dataset. Our results show that all three of our elementary estimators are suitable for speech emotion estimation. Moreover, it is possible to boost the estimation performance by fusing them properly, since they appear to leverage complementary speech features.
  • Keywords
    emotion recognition; numerical analysis; regression analysis; speech recognition; support vector machines; 3D space; emotion expressions; emotion primitives; locally linear reconstruction; multidimensional space; numerical representation; robust regression; speech emotion estimation; speech processing; support vector regression; Acoustics; Artificial neural networks; Correlation; Estimation; Feature extraction; Speech; Three dimensional displays; 3D emotion space; Affective computing; emotion estimation; estimator fusion
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo (ICME), 2010 IEEE International Conference on
  • Conference_Location
    Suntec City
  • ISSN
    1945-7871
  • Print_ISBN
    978-1-4244-7491-2
  • Type
    conf
  • DOI
    10.1109/ICME.2010.5583101
  • Filename
    5583101