DocumentCode :
3059014
Title :
Voice Conversion by Prosody and Vocal Tract Modification
Author :
Rao, K. Sreenivasa ; Yegnanarayana, B.
Author_Institution :
Indian Inst. of Technol. Guwahati, Guwahati
fYear :
2006
fDate :
18-21 Dec. 2006
Firstpage :
111
Lastpage :
116
Abstract :
This paper proposes flexible methods for voice conversion. The methods modify the shape of the vocal tract system and the prosodic characteristics according to the desired requirement. The shape of the vocal tract system is modified by shifting the major resonant frequencies (formants) of the short-term spectrum and altering their bandwidths accordingly. For prosody modification, the required durational and intonational characteristics are imposed on the given speech signal. The prosodic characteristics are manipulated using the instants of significant excitation, which correspond to the instants of glottal closure (epochs) in voiced speech and to random excitations, such as the onset of a burst, in nonvoiced speech. These instants are computed from the linear prediction (LP) residual of the speech signal using the average group delay property of minimum-phase signals. Duration and pitch contour (intonation pattern) are modified by manipulating the LP residual with knowledge of the instants of significant excitation. The modified LP residual then excites a time-varying filter whose parameters are updated according to the desired vocal tract characteristics. The proposed methods are evaluated using listening tests.
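The LP analysis and synthesis chain that the abstract relies on can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard autocorrelation method with Levinson-Durbin recursion, and the function names (`autocorrelation`, `levinson_durbin`, `lp_residual`, `synthesize`) are hypothetical. It shows only how a residual is obtained by inverse filtering and how exciting the all-pole filter with that residual recovers the signal; the epoch extraction, residual modification, and formant shifting described in the abstract are not reproduced here.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns a[0..order] with a[0] = 1, the coefficients of the
    inverse filter A(z) = sum_k a[k] z^-k.
    """
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy, shrinks at each order
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for order i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a


def lp_residual(signal, a):
    """Inverse filtering: e[n] = sum_k a[k] * s[n - k]."""
    order = len(a) - 1
    return [sum(a[k] * signal[n - k]
                for k in range(order + 1) if n - k >= 0)
            for n in range(len(signal))]


def synthesize(residual, a):
    """All-pole filtering: s[n] = e[n] - sum_{k>=1} a[k] * s[n - k]."""
    order = len(a) - 1
    out = []
    for n, e in enumerate(residual):
        past = sum(a[k] * out[n - k]
                   for k in range(1, order + 1) if n - k >= 0)
        out.append(e - past)
    return out
```

With an unmodified residual and unchanged coefficients, `synthesize(lp_residual(s, a), a)` returns `s` up to round-off. In a scheme like the one described, the residual would be altered around the instants of significant excitation and the coefficients replaced frame by frame before synthesis.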
Keywords :
linear predictive coding; speech coding; durational characteristics; epochs; glottal closure; intonation pattern; intonational characteristics; linear prediction; pitch contour; prosody modification; speech signal; time varying filter; vocal tract modification; voice conversion; Artificial neural networks; Bandwidth; Delay; Filters; Frequency estimation; Resonant frequency; Shape; Speech synthesis; TV; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology, 2006. ICIT '06. 9th International Conference on
Conference_Location :
Bhubaneswar
Print_ISBN :
0-7695-2635-7
Type :
conf
DOI :
10.1109/ICIT.2006.92
Filename :
4273166