Title :
Non-parallel training for many-to-many eigenvoice conversion
Author :
Ohtani, Yamato ; Toda, Tomoki ; Saruwatari, Hiroshi ; Shikano, Kiyohiro
Author_Institution :
Grad. Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Nara, Japan
Abstract :
This paper presents a novel method for training an eigenvoice Gaussian mixture model (EV-GMM) that makes effective use of non-parallel data sets for many-to-many eigenvoice conversion, a technique for converting an arbitrary source speaker's voice into an arbitrary target speaker's voice. In the proposed method, an initial EV-GMM is trained with the conventional method using parallel data sets consisting of a single reference speaker and multiple pre-stored speakers. The initial EV-GMM is then further refined using non-parallel data sets that include a larger number of pre-stored speakers, while treating the reference speaker's voices as hidden variables. The experimental results demonstrate that the proposed method yields significant quality improvements in converted speech by making it possible to use data from a larger number of pre-stored speakers.
Keywords :
speaker recognition; speech processing; Gaussian mixture model; converted speech; many-to-many eigenvoice conversion; non-parallel training; Data mining; Information science; Loudspeakers; Microwave integrated circuits; Probability; Quality control; Speech; Virtual colonoscopy; Voice conversion; eigenvoice; many-to-many
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495139