• DocumentCode
    3599820
  • Title

    Joined cepstral distance features two-stage multi-class classification for emotional speech

  • Author

    Changqin Quan ; Bin Zhang ; Fuji Ren

  • Author_Institution
    Hefei Univ. of Technol., Hefei, China
  • fYear
    2014
  • Firstpage
    91
  • Lastpage
    96
  • Abstract
    This letter presents a two-stage multi-class classification method for emotional speech that joins cepstral distance and voice quality features and uses DAG-SVM. The Harmonic to Noise Ratio (HNR) is used to detect throat diseases because it reflects characteristics of the throat; these characteristics also provide a strong basis for distinguishing emotions in speech. The cepstrum and cepstral distance can also measure differences and are widely used for endpoint detection in speech signals. In this work, cepstral distance is used to measure the similarity between frames in emotional statements and frames in neutral signals. The experiment shows that cepstral distance increases the recognition rate for the emotion sad and balances the rates of the other emotion classes except angry. Finally, because the feature sets differ in their ability to express different emotions, a two-stage classification is applied to resolve confusion in multi-emotion recognition. A Chinese Mandarin emotion database is used for recognition, and a large training set (1134+378 utterances) ensures a powerful modeling capability for predicting emotion.
  • Keywords
    diseases; emotion recognition; medical signal processing; patient diagnosis; signal classification; signal detection; speech processing; support vector machines; Chinese mandarin emotion database; DAG-SVM; emotional speech; emotional statement; harmonic to noise ratio; joined cepstral distance; large training set; multiemotion recognition; neutral signals; speech signal detection; throat disease detection; two-stage multi-class classification; two-stage multiclass classification; two-state classification; voice quality feature; Accuracy; Emotion recognition; Feature extraction; Mel frequency cepstral coefficient; Speech; Speech recognition; Cepstral distance; Emotional speech recognition; HNR; PCA; two-stage classification;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Cloud Computing and Intelligence Systems (CCIS), 2014 IEEE 3rd International Conference on
  • Print_ISBN
    978-1-4799-4720-1
  • Type

    conf

  • DOI
    10.1109/CCIS.2014.7175709
  • Filename
    7175709