Title :
An experimental study of speech emotion recognition based on deep convolutional neural networks
Author :
W. Q. Zheng;J. S. Yu;Y. X. Zou
Author_Institution :
ADSPLAB/ELIP, School of Electronic and Computer Engineering, Peking University, Shenzhen, China
Abstract :
Speech emotion recognition (SER) is a challenging task because it is unclear which features reflect the characteristics of human emotion in speech. Moreover, traditional feature extraction methods perform inconsistently across different emotion recognition tasks. Different spectrograms carry information that reflects different emotions. This paper proposes a systematic approach to implementing an effective emotion recognition system based on deep convolutional neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed, and principal component analysis (PCA) is used to reduce the dimensionality and suppress interference. The PCA-whitened spectrogram is then split into non-overlapping segments. The DCNN is constructed to learn a representation of the emotion from these segments using the labeled training speech data. Our preliminary experiments show that the proposed emotion recognition system based on DCNNs (containing 2 convolution and 2 pooling layers) achieves about 40% classification accuracy. It also outperforms SVM-based classification using hand-crafted acoustic features.
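The preprocessing steps named in the abstract (log-spectrogram, PCA whitening, non-overlapping segmentation) can be sketched roughly as follows. This is a minimal illustration in NumPy, not the authors' implementation: the function names, the frame length, hop size, number of principal components, and segment length are all assumptions chosen for the example, since the abstract does not state them. The DCNN itself is not shown.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Log-power spectrogram from a Hann-windowed short-time FFT.

    frame_len and hop are illustrative values, not taken from the paper.
    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps)


def pca_whiten(X, n_components, eps=1e-8):
    """Project the rows of X onto the top principal components and
    rescale each projected dimension to (approximately) unit variance."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order] / np.sqrt(eigvals[order] + eps)
    return Xc @ W


def split_segments(X, seg_len):
    """Cut X (frames x features) into non-overlapping segments of seg_len
    frames each, dropping any leftover frames at the end."""
    n_seg = len(X) // seg_len
    return X[: n_seg * seg_len].reshape(n_seg, seg_len, X.shape[1])


# Usage with synthetic noise standing in for a speech recording.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
spec = log_spectrogram(signal)
whitened = pca_whiten(spec, n_components=40)   # 40 components: assumed value
segments = split_segments(whitened, seg_len=20)  # 20 frames: assumed value
print(segments.shape)
```

Each resulting segment is a small two-dimensional array (frames by whitened features) that could be fed to a convolutional network as a single-channel input, with the utterance's emotion label attached to every segment.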
Keywords :
"Speech","Speech recognition","Emotion recognition","Spectrogram","Feature extraction","Principal component analysis","Convolution"
Conference_Titel :
Affective Computing and Intelligent Interaction (ACII), 2015 International Conference on
Electronic_ISBN :
2156-8111
DOI :
10.1109/ACII.2015.7344669