Title :
Multi-level Speech Emotion Recognition Based on HMM and ANN
Author :
Mao, Xia ; Chen, Lijiang ; Fu, Liqin
Author_Institution :
Sch. of Electron. & Inf. Eng., Beihang Univ., Beijing, China
Date :
March 31, 2009 - April 2, 2009
Abstract :
This paper proposes a new approach to emotion recognition based on a hybrid of hidden Markov models (HMMs) and an artificial neural network (ANN), using both utterance-level and segment-level information from speech. To combine the dynamic time-warping capability of HMMs with the pattern-recognition capability of ANNs, the utterance is treated as a series of voiced segments; the feature vectors extracted from each segment are normalized into fixed-length coefficient sets using orthogonal polynomials, and the resulting distortions serve as one input to the ANN. Meanwhile, the utterance as a whole is modeled by HMMs, and the likelihood probabilities derived from the HMMs are normalized to form another ANN input. Experiments on the Beihang University Database of Emotional Speech (BHUDES) and the Berlin database of emotional speech show that the hybrid HMM/ANN approach is more effective than isolated HMMs, reaching an average recognition rate of 81.7% over five emotion states.
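The fusion step described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the function names, the Legendre polynomial basis, the polynomial degree, and the five emotion classes are all assumptions made for the example.

# Hypothetical sketch of the abstract's feature fusion: segment-level orthogonal
# polynomial distortions plus normalized utterance-level HMM likelihoods, which
# together form the ANN input vector. All names and parameters are assumed.
import numpy as np

EMOTIONS = ["anger", "happiness", "sadness", "fear", "neutral"]  # assumed five classes

def fit_segment_coeffs(feature_track, degree=4):
    # Normalize a variable-length feature trajectory from one voiced segment
    # into a fixed number of orthogonal (here Legendre) polynomial coefficients.
    t = np.linspace(-1.0, 1.0, len(feature_track))   # map the segment onto [-1, 1]
    return np.polynomial.legendre.legfit(t, feature_track, degree)

def segment_distortions(coeffs, class_templates):
    # Euclidean distortion between the segment coefficients and each
    # emotion-class template (templates would be estimated from training data).
    return np.array([np.linalg.norm(coeffs - tpl) for tpl in class_templates])

def normalize_loglikelihoods(loglik):
    # Map per-class HMM log-likelihoods to a comparable [0, 1] range (softmax).
    z = np.exp(loglik - loglik.max())
    return z / z.sum()

# Toy example: one voiced segment's pitch track and stand-in HMM scores.
pitch_track = 120.0 + 10.0 * np.sin(np.linspace(0.0, np.pi, 37))   # variable length
coeffs = fit_segment_coeffs(pitch_track)
templates = [np.random.randn(coeffs.size) for _ in EMOTIONS]        # stand-in templates
ann_input = np.concatenate([segment_distortions(coeffs, templates),
                            normalize_loglikelihoods(np.random.randn(len(EMOTIONS)))])
print(ann_input.shape)   # distortions + normalized likelihoods fed to the ANN

In the paper's setup, the ANN would be trained on such fused vectors to output one of the five emotion states; the sketch only shows how the two input streams could be brought to a common, fixed-length representation.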
Keywords :
emotion recognition; feature extraction; hidden Markov models; neural nets; polynomials; speech recognition; artificial neural network; dynamic time warping; feature vector; hidden Markov model; likelihood probability; multilevel speech emotion recognition; orthogonal polynomials method; pattern recognition; segment level information; Acoustic distortion; Artificial neural networks; Data mining; Emotion recognition; Feature extraction; Hidden Markov models; Pattern recognition; Spatial databases; Speech recognition; Support vector machines; ANN; HMM; multi_level; speech emotion recognition;
Conference_Title :
2009 WRI World Congress on Computer Science and Information Engineering
Conference_Location :
Los Angeles, CA
Print_ISBN :
978-0-7695-3507-4
DOI :
10.1109/CSIE.2009.113