Title :
High accurate model-integration-based voice conversion using dynamic features and model structure optimization
Author :
Saito, Daisuke ; Watanabe, Shinji ; Nakamura, Atsushi ; Minematsu, Nobuaki
Author_Institution :
Univ. of Tokyo, Tokyo, Japan
Abstract :
This paper combines a parameter generation algorithm and a model optimization approach with model-integration-based voice conversion (MIVC). We previously proposed the probabilistic integration of a joint density model and a speaker model to mitigate the requirement for a parallel corpus in voice conversion (VC) based on a Gaussian mixture model (GMM). Like other VC methods, MIVC suffers from two problems: degradation of perceptual quality caused by discontinuities in the parameter trajectory, and difficulty in optimizing the model structure. To address the first problem, this paper proposes a parameter generation algorithm constrained by dynamic features; to address the second, it proposes an information criterion that accounts for the mutual influences between the joint density model and the speaker model. Experimental results show that the first approach improved VC performance and that the second approach appropriately predicted the optimal number of mixtures of the speaker model for MIVC.
Keywords :
Gaussian distribution; integration; speaker recognition; GMM; Gaussian mixture model; MIVC; VC methods; dynamic feature; high accurate model-integration-based voice conversion; joint density model; model structure optimization; parameter generation algorithm; perceptual quality; speaker model; Adaptation models; Data models; Equations; Joints; Mathematical model; Optimization; Speech; Voice conversion; dynamic features; information criterion; probabilistic integration;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947373