Formant-based feature extraction for emotion classification from speech

Author

Jonathan C. Kim;Mark A. Clements

Author_Institution

Georgia Institute of Technology, Atlanta, GA 30332 USA

fYear

2015

fDate

7/1/2015 12:00:00 AM

Firstpage

477

Lastpage

481

Abstract

In a previous study, a robust formant-tracking algorithm was introduced to model the formant and spectral properties of speech. The algorithm uses Gaussian mixtures to estimate spectral parameters and refines the estimates with a maximum a posteriori (MAP) adaptation algorithm. In this paper, the formant-tracking algorithm is used to extract formant-based features for emotion classification. For evaluation, the classification results are compared with those of a linear predictive coding (LPC)-based algorithm. On average, the formant features extracted using the algorithm improved the unweighted accuracy by 2.1 percentage points compared with the LPC-based algorithm. Combining the formant features with other acoustic features yielded a statistically significant improvement of 2.7 percentage points in unweighted accuracy, whereas the LPC-based features improved it by only 1 percentage point. The results indicate that an improved formant-tracking method improves emotion classification accuracy. The effect of formant-based features on emotion classification is also discussed.
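
The abstract reports gains in unweighted accuracy without defining the metric. The sketch below assumes the definition commonly used in speech emotion recognition, namely the mean of per-class recalls, so that each emotion class counts equally regardless of how many utterances it has. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (assumed definition, not from the paper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    total = defaultdict(int)    # number of utterances per true class
    correct = defaultdict(int)  # correctly classified utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """Plain fraction of correctly classified utterances, for comparison."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, imbalanced toy labels: 8 neutral and 2 angry utterances.
y_true = ["neutral"] * 8 + ["angry"] * 2
y_pred = ["neutral"] * 8 + ["neutral", "angry"]

ua = unweighted_accuracy(y_true, y_pred)
wa = weighted_accuracy(y_true, y_pred)
print(f"unweighted accuracy: {ua:.2f}, weighted accuracy: {wa:.2f}")
```

On imbalanced data the two measures can diverge, because a classifier that favors the majority class keeps a high plain accuracy while its per-class recall on minority classes drops. Under this assumed definition, a gain stated in percentage points is an absolute difference between two such averages.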

Keywords

"Feature extraction","Speech","Support vector machines","Accuracy","Prediction algorithms","Bandwidth","Acoustics"

Publisher

IEEE

Conference_Titel

Telecommunications and Signal Processing (TSP), 2015 38th International Conference on

Type

conf

DOI

10.1109/TSP.2015.7296308

Filename

7296308