A personalized emotion recognition system using an unsupervised feature adaptation scheme

Author

Rahman, Tauhidur ; Busso, Carlos

Author_Institution

Dept. of Electr. Eng., Univ. of Texas at Dallas, Dallas, TX, USA

fYear

2012

fDate

25-30 March 2012

Firstpage

5117

Lastpage

5120

Abstract

A personalized emotion recognition system aims to tune the model to recognize the expressive behaviors of a targeted person. Such a system can play an important role in various domains, including call center and health care applications. Adapting a general emotion recognition system to a particular individual requires speech samples from that person and prior knowledge about their emotional content. These requirements constrain the use of such techniques in many real scenarios in which no annotated data is available to train or adapt the models. To address this problem, this paper introduces an unsupervised feature adaptation scheme that aims to reduce the mismatch between the acoustic features used to train the system and the acoustic features extracted from the unknown targeted speaker. The adaptation scheme uses our recently proposed iterative feature normalization (IFN) framework. An emotion detection system is trained with the IEMOCAP database. For testing, a database was created by downloading videos from a video-sharing website; it contains various interviews with a targeted subject (1.5 hours). The detection system is used to identify emotional speech with and without the proposed feature adaptation scheme. The experimental results indicate that the proposed approach improves the unweighted accuracy from 50.8% to 70.0%.

Keywords

emotion recognition; feature extraction; iterative methods; speech recognition; unsupervised learning; IEMOCAP database; IFN framework; acoustic feature extraction; acoustic features; emotional speech detection system; iterative feature normalization framework; personalized emotion recognition system; speech samples; unsupervised feature adaptation scheme; video-sharing Website; Accuracy; Acoustics; Databases; Emotion recognition; Feature extraction; Speech; Testing; Personalized emotion recognition; feature adaptation; front-end feature normalization

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location

Kyoto

ISSN

1520-6149

Print_ISBN

978-1-4673-0045-2

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2012.6289072

Filename

6289072