DocumentCode
705259
Title
Flexible voice morphing based on linear combination of multi-speakers' vocal tract area functions
Author
Nambu, Yoshiki ; Mikawa, Masahiko ; Tanaka, Kazuyo
Author_Institution
Grad. Sch. of Libr., Inf. & Media Studies, Univ. of Tsukuba, Tsukuba, Japan
fYear
2010
fDate
23-27 Aug. 2010
Firstpage
790
Lastpage
794
Abstract
This paper presents a flexible voice morphing method based on conversion using a linear combination of multi-speakers' vocal tract area functions, in which phonological identity is maintained in terms of the overall interpolated area. In this system, the characteristics of vocal tract resonances are separated from those of glottal source waves using AR-HMM analysis of speech. The vocal tract resonances and glottal source wave characteristics are morphed independently. For the morphing of vocal tract resonances, log-area vocal tract functions, which are derived from AR coefficients, are normalized and then processed by a statistical mapping technique. For glottal source waves, statistical mapping is conducted in the cepstrum domain. Morphed speech is re-synthesized by applying an AR filter to the converted glottal source waves, which are themselves re-synthesized using cepstrum-domain conversion. Using the proposed morphing system, the continuity of formants is confirmed, as are perceptual differences between a conventional method and the proposed method.
Keywords
cepstral analysis; hidden Markov models; speaker recognition; speech processing; AR filter; AR-HMM speech analysis; cepstrum domain conversion; flexible voice morphing; glottal source wave characteristics; glottal source waves; linear combination; multispeaker vocal tract area functions; phonological identity; statistical mapping; vocal tract resonances; Analytical models; Cepstrum; Estimation; Hidden Markov models; Interpolation; Speech; Training;
fLanguage
English
Publisher
IEEE
Conference_Title
Signal Processing Conference, 2010 18th European
Conference_Location
Aalborg
ISSN
2219-5491
Type
conf
Filename
7096532