Regional Information Center for Science and Technology (RICEST) - Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity

DocumentCode :

3744833

Title :

Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity

Author :

Mahsa Sadat Elyasi Langarani;Jan van Santen

Author_Institution :

Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, USA

Year :

2015

Firstpage :

116

Lastpage :

123

Abstract :

In this study, we propose a new intonation adaptation method that transforms the perceived identity of a Text-To-Speech system to that of a target speaker using a small amount of training data. During training, we fit parametrized accent and phrase curves to the F0 curves of parallel recordings of the target speaker, and estimate the parameters of a mapping between the corresponding parameter spaces. At test time, we fit the accent and phrase curves to the source utterances, apply the mapping, and create an F0 contour from the mapped accent and phrase curves. We compare the proposed method with a baseline adaptation method in which the source F0 contour is transformed linearly such that the per-utterance mean and variance of the target F0 contour are left unaltered. The proposed method outperformed the baseline in two subjective perceptual tests, one assessing similarity to the target speaker and the other assessing speech quality.
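
The baseline described in the abstract can be illustrated with a minimal sketch, assuming it amounts to a per-utterance shift and scale of the source F0 contour so that the result carries the target's mean and variance. The function name `linear_f0_transform`, the use of Hz rather than log-F0, and the convention that unvoiced frames are coded as 0 are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np

def linear_f0_transform(source_f0, target_mean, target_std):
    """Shift and scale voiced F0 values to match a target mean and std.

    Hypothetical helper: assumes F0 in Hz and unvoiced frames coded as 0.
    """
    source_f0 = np.asarray(source_f0, dtype=float)
    voiced = source_f0 > 0              # assumption: 0 marks unvoiced frames
    out = np.zeros_like(source_f0)      # unvoiced frames stay at 0
    if not voiced.any():
        return out
    src = source_f0[voiced]
    src_mean, src_std = src.mean(), src.std()
    # Guard against a flat contour, where the scale would be undefined.
    scale = target_std / src_std if src_std > 0 else 1.0
    out[voiced] = (src - src_mean) * scale + target_mean
    return out

# Usage with made-up values: one utterance, target statistics given directly.
converted = linear_f0_transform([0, 100, 120, 140, 0], 200.0, 30.0)
```

After the transform, the voiced frames of `converted` have the requested mean and standard deviation, while the shape of the source contour is preserved up to scaling. The proposed method in the abstract differs in that it maps accent and phrase curve parameters instead.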

Keywords :

"Mathematical model","Hidden Markov models","Adaptation models","Speech","Feature extraction","Training","Transforms"

Publisher :

IEEE

Conference_Title :

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

Type :

Conference paper

DOI :

10.1109/ASRU.2015.7404783

Filename :

7404783

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3744833