  • DocumentCode
    134329
  • Title
    Speech based emotion recognition using spectral feature extraction and an ensemble of kNN classifiers
  • Author
    Rieger, Steven A. ; Muraleedharan, Rajani ; Ramachandran, Ravi P.
  • Author_Institution
    Electr. & Comput. Eng., Rowan Univ., Glassboro, NJ, USA
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    589
  • Lastpage
    593
  • Abstract
    Security (and cyber security) is an important issue in existing and developing technology. It is imperative that cyber security go beyond password based systems to avoid criminal activities. A human biometric and emotion based recognition framework implemented in parallel can enable applications to access personal or public information securely. The focus of this paper is on the study of speech based emotion recognition using a pattern recognition paradigm with spectral feature extraction and an ensemble of k nearest neighbor (kNN) classifiers. The five spectral features are the linear predictive cepstrum (CEP), mel frequency cepstrum (MFCC), line spectral frequencies (LSF), adaptive component weighted cepstrum (ACW) and the post-filter cepstrum (PFL). The bagging algorithm is used to train the ensemble of kNNs. Fusion is implicitly accomplished by ensemble classification. The LDC emotional prosody speech database is used in all the experiments. Results show that the maximum gain in performance is achieved by using two kNNs as opposed to using a single kNN.
  • Keywords
    emotion recognition; feature extraction; learning (artificial intelligence); security of data; signal classification; speech recognition; ACW feature; CEP feature; LDC emotional prosody speech database; LSF feature; MFCC feature; Mel frequency cepstrum; PFL feature; adaptive component weighted cepstrum; bagging algorithm; biometric based recognition framework; cyber security; emotion based recognition framework; k-nearest neighbor; kNN classifier ensemble; line spectral frequencies; linear predictive cepstrum; password based system; pattern recognition paradigm; personal information; post-filter cepstrum; public information; spectral feature extraction; speech based emotion recognition; Cepstrum; Emotion recognition; Mel frequency cepstral coefficient; Speech; Speech recognition; Training; Vectors; ensemble kNN classifier; fusion; machine learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2014.6936711
  • Filename
    6936711
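
The abstract describes bagging used to train an ensemble of kNN classifiers, with fusion arising implicitly from the ensemble decision. A minimal sketch of that general idea in plain Python follows. It is not the authors' implementation: the function names, the majority-vote fusion rule, the toy two-dimensional vectors standing in for spectral features, the emotion labels, and the choices of k=3 and three ensemble members are all assumptions made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    # Squared Euclidean distance gives the same neighbor ordering as Euclidean.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def train_bagged_knn(train_x, train_y, n_members, seed=0):
    """Build one bootstrap resample (drawn with replacement) per ensemble member."""
    rng = random.Random(seed)
    n = len(train_x)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(([train_x[i] for i in idx], [train_y[i] for i in idx]))
    return members


def ensemble_predict(members, query, k=3):
    """Fuse the members' individual kNN decisions by majority vote."""
    votes = Counter(knn_predict(x, y, query, k) for x, y in members)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters of 2-D "feature vectors".
train_x = [[i * 0.1, i * 0.1] for i in range(10)]
train_x += [[5 + i * 0.1, 5 + i * 0.1] for i in range(10)]
train_y = ["neutral"] * 10 + ["angry"] * 10

members = train_bagged_knn(train_x, train_y, n_members=3)
print(ensemble_predict(members, [0.2, 0.3]))
print(ensemble_predict(members, [5.4, 5.5]))
```

An odd number of members avoids tied votes in this two-class toy case; with an even number, a tie in this sketch falls to the label that was counted first.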