• DocumentCode
    3703343
  • Title

    Exploring dataset similarities using PCA-based feature selection

  • Author

    Ingo Siegert;Ronald Böck;Andreas Wendemuth;Bogdan Vlasenko

  • Author_Institution
    Cognitive Systems Group, Otto von Guericke University Magdeburg, Germany
  • fYear
    2015
  • Firstpage
    387
  • Lastpage
    393
  • Abstract
    In emotion recognition from speech, several well-established corpora are used to date for the development of classification engines. The data is annotated differently, and the community in the field uses a variety of feature extraction schemes. The aim of this paper is to investigate promising features for individual corpora and then compare the results to propose optimal features across data sets, introducing a new ranking method. Further, this enables us to present a method for automatic identification of groups of corpora with similar characteristics. This answers an urgent question in classifier development, namely whether data from different corpora is similar enough to be used jointly as training material, overcoming the shortage of material in matching domains. We compare the results of this method with manual groupings of corpora. We consider the established emotional speech corpora AVIC, ABC, DES, EMO-DB, ENTERFACE, SAL, SMARTKOM, SUSAS and VAM; however, our approach is general.
  • Keywords
    "Speech","Feature extraction","Principal component analysis","Databases","Speech recognition","Noise measurement","Stress"
  • Publisher
    IEEE
  • Conference_Titel
    Affective Computing and Intelligent Interaction (ACII), 2015 International Conference on
  • Electronic_ISBN
    2156-8111
  • Type

    conf

  • DOI
    10.1109/ACII.2015.7344600
  • Filename
    7344600