Regional Information Center for Science and Technology - Efficient Recognition of Human Emotional States from Audio Signals

DocumentCode :

259410

Title :

Efficient Recognition of Human Emotional States from Audio Signals

Author :

Erdem, Ernur Sonat ; Sert, Mustafa

Author_Institution :

Dept. of Comput. Eng., Baskent Univ., Ankara, Turkey

fYear :

2014

fDate :

10-12 Dec. 2014

Firstpage :

139

Lastpage :

142

Abstract :

Automatic recognition of human emotional states is an important task for efficient human-machine communication. Most existing work focuses on recognizing emotional states from audio signals alone, visual signals alone, or both. Here we propose empirical methods for feature extraction and classifier optimization that consider the temporal aspects of audio signals, and we introduce a framework to efficiently recognize human emotional states from audio signals. The framework is based on the prediction of input audio clips that are described using representative low-level features. In the experiments, seven discrete emotional states (anger, fear, boredom, disgust, happiness, sadness, and neutral) from the EmoDB dataset are recognized and tested based on nineteen audio features (15 standalone, 4 joint) using a Support Vector Machine (SVM) classifier. Extensive experiments have been conducted to demonstrate the effect of the feature extraction and classifier optimization methods on the recognition accuracy of the emotional states. Our experiments show that the feature extraction and classifier optimization procedures lead to a significant improvement of over 11 percentage points in emotion recognition. As a result, the overall recognition accuracy achieved for seven emotions in the EmoDB dataset is 83.33%, compared to the baseline accuracy of 72.22%.
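
The reported gain can be checked directly from the two accuracy figures given in the abstract. The following is a minimal sketch, assuming the improvement is meant as an absolute difference in percentage points; the helper name accuracy_gain is hypothetical and does not come from the paper.

```python
def accuracy_gain(baseline_pct, improved_pct):
    """Return (absolute gain in percentage points, relative gain in percent).

    Hypothetical helper for illustration; both inputs are accuracies in percent.
    """
    absolute = improved_pct - baseline_pct
    relative = absolute / baseline_pct * 100.0
    return absolute, relative


# Accuracy figures quoted in the abstract (EmoDB, seven emotions).
absolute, relative = accuracy_gain(72.22, 83.33)
print(f"absolute gain: {absolute:.2f} percentage points")
print(f"relative gain: {relative:.2f}% over the baseline")
```

Under this reading, the absolute gain is about 11.11 percentage points, which matches the "over 11" figure, while the gain relative to the baseline is about 15.38%.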

Keywords :

audio signal processing; emotion recognition; feature extraction; optimisation; signal classification; support vector machines; EmoDB dataset; SVM classifier; anger; audio features; audio signal temporal aspects; boredom; classifier optimization; discrete emotional states; disgust; empirical methods; fear; feature extraction; happiness; human emotional state recognition; human-machine communication; input audio clip prediction; neutral; representative low-level features; sadness; support vector machine classifier; visual signals; Accuracy; Emotion recognition; Feature extraction; Joints; Mel frequency cepstral coefficient; Support vector machines; Vectors; audio based emotion recognition; affective co;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multimedia (ISM), 2014 IEEE International Symposium on

Conference_Location :

Taichung

Print_ISBN :

978-1-4799-4312-8

Type :

conf

DOI :

10.1109/ISM.2014.81

Filename :

7033010

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=259410