Exploring dataset similarities using PCA-based feature selection

Author

Ingo Siegert;Ronald Böck;Andreas Wendemuth;Bogdan Vlasenko

Author_Institution

Cognitive Systems Group, Otto von Guericke University Magdeburg, Germany

fYear

2015

Firstpage

387

Lastpage

393

Abstract

In emotion recognition from speech, several well-established corpora are currently used for the development of classification engines. The data is annotated differently, and the community in the field uses a variety of feature extraction schemes. The aim of this paper is to investigate promising features for individual corpora and then compare the results in order to propose optimal features across data sets, introducing a new ranking method. Further, this enables us to present a method for automatic identification of groups of corpora with similar characteristics. This answers an urgent question in classifier development, namely whether data from different corpora is similar enough to be used jointly as training material, overcoming the shortage of material in matching domains. We compare the results of this method with manual groupings of corpora. We consider the established emotional speech corpora AVIC, ABC, DES, EMO-DB, ENTERFACE, SAL, SMARTKOM, SUSAS and VAM; however, our approach is general.

Keywords

"Speech","Feature extraction","Principal component analysis","Databases","Speech recognition","Noise measurement","Stress"

Publisher

IEEE

Conference_Titel

Affective Computing and Intelligent Interaction (ACII), 2015 International Conference on

Electronic_ISBN

2156-8111

Type

conf

DOI

10.1109/ACII.2015.7344600

Filename

7344600