Emotion recognition from speech based on relevant feature and majority voting

Author

Sarker, Md Kamruzzaman ; Alam, Kazi Md Rokibul ; Arifuzzaman, M.

Author_Institution

Dept. of Comput. Sci. & Eng., Khulna Univ. of Eng. & Technol., Khulna, Bangladesh

fYear

2014

fDate

23-24 May 2014

Firstpage

1

Lastpage

5

Abstract

This paper proposes an approach for detecting emotion from human speech that applies majority voting over several machine learning techniques. The contribution of this work is twofold: first, it selects the speech features that are most promising for classification; second, it uses majority voting to determine the emotion class. Majority voting is applied over a Neural Network (NN), a Decision Tree (DT), a Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The input vector of the NN, DT, SVM and KNN consists of various acoustic and prosodic features such as pitch and Mel-frequency cepstral coefficients. Many features are extracted from the speech signal, and only the promising ones are selected. To judge whether a feature is promising, the Fast Correlation-Based Feature selection (FCBF) and Fisher score algorithms are used, and only those features that are ranked highly by both are kept. The proposed approach has been tested on the Berlin dataset of emotional speech [3] and the Electromagnetic Articulography (EMA) dataset [4]. The experimental results show that majority voting attains better accuracy than the individual machine learning techniques. The proposed approach can be used to recognize human emotion in applications such as social robots, intelligent chat clients and company call centers.
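
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the top-k cutoff used to decide what counts as "highly ranked by both", and the tie-breaking rule (the earliest classifier in the list wins) are all assumptions made for illustration, since the abstract does not specify them. The FCBF ranking is taken as a given input rather than computed, and the feature names and emotion labels in the usage note are hypothetical.

```python
from collections import Counter


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter / within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            mean_c = sum(values) / len(values)
            var_c = sum((v - mean_c) ** 2 for v in values) / len(values)
            between += len(values) * (mean_c - overall_mean) ** 2
            within += len(values) * var_c
        # A feature that is constant within every class separates perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def select_by_both(ranking_a, ranking_b, k):
    """Keep features in the top k of BOTH rankings (k is an assumed cutoff)."""
    top_b = set(ranking_b[:k])
    return [f for f in ranking_a[:k] if f in top_b]


def majority_vote(predictions):
    """Return the most frequent label among the classifiers' predictions.

    Assumed tie-break: the tied label predicted by the earliest classifier.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
```

For example, `majority_vote(["anger", "anger", "sadness", "anger"])` returns `"anger"`, where the four entries stand for the outputs of the NN, DT, SVM and KNN on one utterance. With four voters a 2-2 split is possible, which is why some tie-breaking rule has to be chosen.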

Keywords

decision trees; emotion recognition; feature selection; learning (artificial intelligence); neural nets; signal classification; speech recognition; support vector machines; Berlin dataset; DT; EMA dataset; FCBF; Fisher score algorithms; KNN; NN; SVM; decision tree; electromagnetic articulography dataset; fast correlation based feature selection; human speech; k-nearest neighbor; machine learning technique; majority voting technique; neural network; speech classification; support vector machine; Accuracy; Feature extraction; Mel frequency cepstral coefficient; Speech

fLanguage

English

Publisher

IEEE

Conference_Titel

Informatics, Electronics & Vision (ICIEV), 2014 International Conference on

Conference_Location

Dhaka

Print_ISBN

978-1-4799-5179-6

Type

conf

DOI

10.1109/ICIEV.2014.6850685

Filename

6850685