SVM based speaker emotion recognition in continuous scale

Author

Hric, Martin ; Chmulik, Michal ; Guoth, Igor ; Jarina, Roman

Author_Institution

Dept. of Telecommun. & Multimedia, Univ. of Zilina, Zilina, Slovakia

fYear

2015

fDate

21-22 April 2015

Firstpage

339

Lastpage

342

Abstract

In this paper we propose a system for speaker emotion recognition based on SVM regression. The recognized emotional state is expressed on a continuous scale in three dimensions: valence, activation and dominance. Experiments were performed on the IEMOCAP database, which contains 6 basic emotions supplemented with 3 additional emotions. Audio recordings from the corpus were divided into voiced and unvoiced segments, and for both types a large collection of diverse audio features (830/710) was extracted. Then 40 features for each type of segment were selected by Particle Swarm Optimization. Classification accuracy is expressed by cross-correlation coefficients between the estimated (by the proposed system) and real (assigned according to human judgements) emotional state labels. Experiments conducted on the dataset show very promising results for future experiments.
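
The evaluation step described in the abstract can be illustrated with a minimal sketch. It assumes that the "cross-correlation coefficient" is the Pearson correlation computed separately for each dimension; the abstract does not give the exact formula. The function names and the row layout (valence, activation, dominance) are hypothetical and not taken from the paper.

```python
from math import sqrt

# Order of the three emotion dimensions named in the abstract.
DIMENSIONS = ("valence", "activation", "dominance")


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def per_dimension_correlation(estimated, reference):
    """Correlate estimated and human-assigned labels for each dimension.

    Both arguments are lists of (valence, activation, dominance) rows,
    one row per utterance (an assumed layout, for illustration only).
    """
    return {
        name: pearson([row[i] for row in estimated],
                      [row[i] for row in reference])
        for i, name in enumerate(DIMENSIONS)
    }
```

A value near 1 for a dimension means the estimates track the human ratings closely on that dimension, and a value near 0 means they carry little information about them.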

Keywords

particle swarm optimisation; speaker recognition; support vector machines; IEMOCAP database; SVM regression; cross-correlation coefficients; particle swarm optimization; speaker emotion recognition; Accuracy; Correlation; Emotion recognition; Feature extraction; Speech; Speech recognition; Support vector machines;

fLanguage

English

Publisher

IEEE

Conference_Titel

Radioelektronika (RADIOELEKTRONIKA), 2015 25th International Conference

Conference_Location

Pardubice

Print_ISBN

978-1-4799-8117-5

Type

conf

DOI

10.1109/RADIOELEK.2015.7129063

Filename

7129063