Rules Based Feature Modification for Affective Speaker Recognition

Author

Wu, Zhaohui ; Li, Dongdong ; Yang, Yingchun

Author_Institution

Coll. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou

Volume

1

fYear

2006

fDate

14-19 May 2006

Abstract

One of the largest challenges in speaker recognition applications is dealing with speaker-emotion variability. In this paper, we further investigate rule-based feature modification for robust speaker recognition with emotional speech. Specifically, we learn rules for modifying prosodic features from a small number of content-matched source-target pairs. Features carrying emotion information are derived from the prevalent neutral features by applying the modification rules. The converted features are then used together with the neutral features to train the speaker models. The effects of individual and combined modifications of duration, pitch, and amplitude are also studied using the EPST dataset, recorded by 8 professional actors with 14 kinds of emotional expressiveness. The results show that duration modifications play the most important role and that pitch modifications are more effective than amplitude modifications. The identification rate improves by 7.83% compared with traditional speaker recognition.
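
The pipeline described in the abstract (learn modification rules from paired utterances, convert neutral features, train on both) can be illustrated with a minimal sketch. The abstract does not state the form of the rules, so everything below is an assumption made for illustration: each rule is a simple mean ratio per prosodic parameter, each utterance is reduced to three utterance-level values, and the names `learn_rule`, `apply_rule`, and `build_training_set` are hypothetical. The toy numbers are invented and are not taken from the EPST dataset.

```python
from statistics import mean

# Prosodic parameters named in the abstract.
PROSODIC_KEYS = ("duration", "pitch", "amplitude")


def learn_rule(pairs):
    """Learn one scaling factor per prosodic parameter.

    `pairs` is a list of (neutral, emotional) dicts for content-matched
    utterances. ASSUMPTION: a rule is the mean emotional/neutral ratio;
    the paper's actual rule form is not given in the abstract.
    """
    return {k: mean(emo[k] / neu[k] for neu, emo in pairs) for k in PROSODIC_KEYS}


def apply_rule(neutral, rule, use=PROSODIC_KEYS):
    """Convert neutral features, modifying only the parameters listed in `use`."""
    return {k: (v * rule[k] if k in use else v) for k, v in neutral.items()}


def build_training_set(neutral_utts, rule, use=PROSODIC_KEYS):
    """Pool the neutral features with their converted counterparts."""
    return list(neutral_utts) + [apply_rule(u, rule, use) for u in neutral_utts]


# Toy, made-up source-target pairs (not real data).
pairs = [
    ({"duration": 1.0, "pitch": 100.0, "amplitude": 0.5},
     {"duration": 1.2, "pitch": 150.0, "amplitude": 0.5}),
    ({"duration": 2.0, "pitch": 200.0, "amplitude": 1.0},
     {"duration": 2.8, "pitch": 300.0, "amplitude": 1.0}),
]

rule = learn_rule(pairs)

# Duration-only modification, one of the individual settings the abstract compares.
training = build_training_set([neu for neu, _ in pairs], rule, use=("duration",))
```

Passing a different `use` tuple selects an individual or combined modification of duration, pitch, and amplitude, which mirrors the comparison the abstract reports. Training the speaker models on the pooled set is left out, since the abstract does not name the model type.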

Keywords

emotion recognition; feature extraction; speaker recognition; content matched source-target pairs; emotional speech; rules based feature modification; speaker recognition; speaker-emotion variability; Authentication; Computer science; Databases; Educational institutions; Feature extraction; Loudspeakers; Robustness; Speaker recognition; Speech analysis; Speech synthesis;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on

Conference_Location

Toulouse

ISSN

1520-6149

Print_ISBN

1-4244-0469-X

Type

conf

DOI

10.1109/ICASSP.2006.1660107

Filename

1660107