Title :
Transformation of speaker characteristics for voice conversion
Author :
Rentzos, Dimitrios ; Vaseghi, Saeed ; Turajlic, Emir ; Yan, Qin ; Ho, Ching-Hsiang
Author_Institution :
Dept. of Electron. & Comput. Eng., Brunel Univ., Uxbridge, UK
Date :
30 Nov.-3 Dec. 2003
Abstract :
The paper presents a voice conversion method based on analysis and transformation of the characteristics that define a speaker's voice. Voice characteristic features are grouped into three main categories: (a) the spectral features at formants; (b) the pitch and intonation pattern; (c) the glottal pulse shape. Modelling and transformation methods for each group of voice features are outlined. The spectral features at formants are modelled using a two-dimensional phoneme-dependent HMM. Subband frequency warping is used for spectrum transformation, where the subbands are centred on estimates of formant trajectories. The F0 contour, extracted from autocorrelation-based pitchmarks, is used for modelling the pitch and intonation patterns of speech. A PSOLA-based method is used for transformation of pitch, intonation patterns and speaking rate. Finally, a method based on deconvolution of the vocal tract is used for modelling and mapping of the glottal pulse. The experimental results include illustrations of the transformations of the various characteristics, together with perceptual evaluations.
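The abstract mentions an F0 contour derived from autocorrelation-based pitchmarks but gives no procedure. As a rough illustration only, the sketch below estimates F0 for a single voiced frame by picking the lag with the largest autocorrelation value. It is not the authors' pitchmark method: the function name `estimate_f0_autocorr`, the search range `f0_min`/`f0_max`, and the synthetic test tone are all assumptions made for this example.

```python
import math


def estimate_f0_autocorr(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    Hypothetical helper for illustration; the search range is an assumption.
    Returns None if no positive autocorrelation peak is found in the range.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove DC offset before correlating

    # Convert the assumed F0 search range into a lag range in samples.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation at this lag.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_value:
            best_lag, best_value = lag, r

    if best_lag is None:
        return None
    return sample_rate / best_lag


# Synthetic 100 Hz tone sampled at 8 kHz (assumed test signal, not paper data).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100.0 * t / sample_rate) for t in range(400)]
f0 = estimate_f0_autocorr(tone, sample_rate)
```

For the synthetic tone the estimate should land close to 100 Hz. Applying the function to successive frames of voiced speech would give a coarse F0 contour; a practical system would also need a voicing decision and protection against octave errors, neither of which is attempted here.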
Keywords :
correlation methods; feature extraction; hidden Markov models; learning (artificial intelligence); parameter estimation; speech processing; F0 contour; HMM training; formant spectral features; formant trajectories; glottal pulse shape; intonation pattern; phoneme-dependent HMM; pitch pattern; speaker characteristics transformation; spectrum transformation; subband frequency warping; vocal tract deconvolution; voice conversion; voice feature extraction; voice mapping; Autocorrelation; Bandwidth; Feature extraction; Frequency estimation; Hidden Markov models; Pulse shaping methods; Shape; Speech processing; Speech synthesis; Viterbi algorithm
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318526