Regional Information Center for Science and Technology - Iterative Feature Normalization Scheme for Automatic Emotion Detection from Speech

DocumentCode :

63983

Title :

Iterative Feature Normalization Scheme for Automatic Emotion Detection from Speech

Author :

Busso, Carlos ; Mariooryad, S. ; Metallinou, Angeliki ; Narayanan, Shrikanth

Author_Institution :

Erik Jonsson Sch. of Eng. & Comput. Sci., Univ. of Texas at Dallas, Richardson, TX, USA

Volume :

Issue :

fYear :

2013

fDate :

Oct.-Dec. 2013

Firstpage :

386

Lastpage :

397

Abstract :

The externalization of emotion is intrinsically speaker-dependent. A robust emotion recognition system should be able to compensate for these differences across speakers. A natural approach is to normalize the features before training the classifiers. However, the normalization scheme should not affect the acoustic differences between emotional classes. This study presents the iterative feature normalization (IFN) framework, which is an unsupervised front-end, especially designed for emotion detection. The IFN approach aims to reduce the acoustic differences between the neutral speech across speakers, while preserving the inter-emotional variability in expressive speech. This goal is achieved by iteratively detecting neutral speech for each speaker, and using this subset to estimate the feature normalization parameters. Then, an affine transformation is applied to both neutral and emotional speech. This process is repeated until the results from the emotion detection system are consistent between consecutive iterations. The IFN approach is exhaustively evaluated using the IEMOCAP database and a data set obtained under free uncontrolled recording conditions with different evaluation configurations. The results show that the systems trained with the IFN approach achieve better performance than systems trained either without normalization or with global normalization.
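
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `ifn_normalize`, the `detect_neutral` callback, the `max_iter` and `min_neutral` parameters, the all-neutral initialization, and the choice of a per-speaker mean/standard-deviation transform as the affine mapping are all assumptions made for this example.

```python
import numpy as np

def ifn_normalize(features, speakers, detect_neutral, max_iter=20, min_neutral=2):
    """Hypothetical sketch of an iterative feature normalization loop.

    features:       (n, d) array of utterance-level acoustic features
    speakers:       (n,) array of speaker identifiers
    detect_neutral: assumed callable, (n, d) array -> boolean (n,) array
    """
    features = np.asarray(features, dtype=float)
    speakers = np.asarray(speakers)
    # Assumed initialization: treat every utterance as neutral.
    neutral = np.ones(len(features), dtype=bool)
    normalized = features.copy()
    for _ in range(max_iter):
        normalized = np.empty_like(features)
        for spk in np.unique(speakers):
            idx = speakers == spk
            # Estimate parameters from this speaker's detected-neutral subset.
            ref = features[idx & neutral]
            if len(ref) < min_neutral:
                ref = features[idx]  # fallback: use all of the speaker's data
            mean = ref.mean(axis=0)
            std = ref.std(axis=0)
            std[std == 0] = 1.0  # avoid division by zero
            # Affine transform applied to neutral and emotional speech alike.
            normalized[idx] = (features[idx] - mean) / std
        new_neutral = np.asarray(detect_neutral(normalized), dtype=bool)
        # Stop once detection results agree between consecutive iterations.
        if np.array_equal(new_neutral, neutral):
            break
        neutral = new_neutral
    return normalized, neutral
```

Here the stopping rule requires identical labels in two consecutive passes; a tolerance on the fraction of changed labels would be an equally plausible reading of "consistent between consecutive iterations".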

Keywords :

emotion recognition; speech recognition; IEMOCAP database; IFN; acoustic differences; automatic emotion detection; emotion externalization; emotional classes; emotional speech; expressive speech; global normalization; interemotional variability; iterative feature normalization scheme; neutral speech; robust emotion recognition system; speakers; uncontrolled recording conditions; unsupervised front-end; Acoustics; Databases; Emotion recognition; Feature extraction; Robustness; Speech; Training; Emotion recognition; emotion; features normalization; speaker normalization;

fLanguage :

English

Journal_Title :

Affective Computing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1949-3045

Type :

jour

DOI :

10.1109/T-AFFC.2013.26

Filename :

6645371

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=63983