Regional Information Center for Science and Technology (RICEST) - Transformation of Speaker Characteristics in Speech Using Support Vector Machines

DocumentCode :

2712152

Title :

Transformation of Speaker Characteristics in Speech Using Support Vector Machines

Author :

Rao, K. Sreenivasa ; Koolagudi, Shashidhar G.

Author_Institution :

Indian Inst. of Technol. Kharagpur, Kharagpur

fYear :

2007

fDate :

18-21 Dec. 2007

Firstpage :

660

Lastpage :

665

Abstract :

In this paper we propose support vector machines (SVMs) for transforming the speaker characteristics of speech. Speaker characteristics are mainly influenced by the behavioural characteristics (prosody) of the speaker, the characteristics of the vocal tract system, and the excitation source. In this work, speaker transformation means modifying the speaker characteristics of the speech to match the desired (target) speaker while preserving the underlying message (the sequence of sound units, i.e., the text) of the original speech. This is performed by deriving mapping functions for transforming the vocal tract characteristics and the prosodic characteristics. SVMs are explored for deriving these mapping functions. The prosodic parameters and the characteristics of the vocal tract system and the excitation source of the target speaker are obtained from the output of the mapping functions. The prosodic parameters (durational characteristics, pitch contour (intonation pattern), and intensity patterns) are manipulated by modifying the linear prediction (LP) residual using knowledge of the instants of significant excitation. The modified LP residual is used to excite the time-varying filter, whose parameters are updated according to the desired vocal tract characteristics. The target speaker's speech is synthesized and evaluated using listening tests. The results of the listening tests indicate that the proposed mapping functions using SVMs provide better speaker transformation than the earlier methods proposed by the author.
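
The source-filter steps named in the abstract (extracting an LP residual and using it to excite an all-pole filter) can be illustrated with a minimal single-frame sketch. This is not the authors' implementation: the function names, the autocorrelation method with the Levinson-Durbin recursion, and the choice of predictor order are assumptions made for illustration. The SVM mapping functions and the residual modification based on instants of significant excitation are not shown.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations by the Levinson-Durbin recursion.

    Returns predictor coefficients a[1..order] for the convention
    s_hat[n] = sum_k a[k] * s[n - k].
    """
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy, updated at every order
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for order i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, coeffs):
    """Inverse-filter the frame: e[n] = s[n] - sum_k a[k] * s[n - k]."""
    residual = []
    for n, sample in enumerate(frame):
        pred = sum(c * frame[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(sample - pred)
    return residual


def lp_synthesize(residual, coeffs):
    """Excite the all-pole filter: s[n] = e[n] + sum_k a[k] * s[n - k]."""
    out = []
    for n, e in enumerate(residual):
        pred = sum(c * out[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(e + pred)
    return out
```

With an unmodified residual and unchanged coefficients, `lp_synthesize(lp_residual(frame, a), a)` reproduces the frame up to rounding error. In a transformation system of the kind the abstract describes, the residual and the filter coefficients would each be altered between these two steps.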

Keywords :

speaker recognition; support vector machines; time-varying filters; linear prediction residual; mapping functions; prosodic parameters; speaker characteristics; speaker transformation; vocal tract system; Artificial neural networks; Filters; Frequency estimation; Humans; Loudspeakers; Natural languages; Signal synthesis; Speech synthesis; Testing

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advanced Computing and Communications, 2007. ADCOM 2007. International Conference on

Conference_Location :

Guwahati, Assam

Print_ISBN :

0-7695-3059-1

Type :

conf

DOI :

10.1109/ADCOM.2007.47

Filename :

4426043

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2712152