Title :
Voice conversion with UBM and speaker-specific model adaptation
Author :
Chunlei Zhu ; Yibiao Yu
Author_Institution :
Sch. of Electron. & Inf. Eng., Soochow Univ., Suzhou, China
Abstract :
Traditional voice conversion algorithms usually rely on a parallel speech corpus and joint training, but parallel data are difficult to obtain and such systems are inflexible to extend in practical applications. This paper presents a non-parallel, non-joint training algorithm for voice conversion that uses a Universal Background Model (UBM) and Maximum a Posteriori (MAP) adaptation. First, a UBM reflecting the speaker-independent statistical distribution of features is trained on non-parallel speech samples from all speakers. Then, with the UBM acting as the prior model, each speaker-specific model is derived using a new parameter estimation based on MAP adaptation. Experimental results show that the proposed method achieves conversion performance equivalent to that of the traditional parallel-corpus-based method while offering more flexible system extension.
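The adaptation step described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the paper's "new parameter estimation", so the code below substitutes the widely used relevance-factor, mean-only MAP update for a diagonal-covariance GMM; it should not be read as the authors' method. The function name `map_adapt_means`, the `relevance` parameter, and the toy two-component UBM are all assumptions made for illustration.

```python
import numpy as np


def map_adapt_means(weights, means, variances, X, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (the UBM) to frames X.

    Assumed textbook update, not the paper's estimator:
        mu_k_new = alpha_k * E_k[x] + (1 - alpha_k) * mu_k,
        alpha_k  = n_k / (n_k + relevance)
    """
    # Log-density of every frame under every component: shape (T, K).
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_post = np.log(weights)[None, :] + log_gauss

    # Normalise to component posteriors (subtract the row max for stability).
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Sufficient statistics: soft counts and posterior-weighted frame means.
    n_k = post.sum(axis=0)                                   # (K,)
    e_x = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]     # (K, D)

    # Interpolate between the speaker data and the UBM prior.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * e_x + (1.0 - alpha) * means


# Toy UBM with two components in two dimensions (hypothetical values).
ubm_w = np.array([0.5, 0.5])
ubm_mu = np.array([[-1.0, 0.0], [1.0, 0.0]])
ubm_var = np.ones((2, 2))

# Synthetic "speaker" frames clustered near the second component.
rng = np.random.default_rng(0)
speaker_frames = rng.normal(loc=[1.5, 0.5], scale=0.3, size=(200, 2))

adapted = map_adapt_means(ubm_w, ubm_mu, ubm_var, speaker_frames)
print(adapted)
```

In this sketch each speaker model is adapted from the shared UBM independently, so adding a speaker requires only that speaker's own non-parallel data. That is one plausible reading of the flexible system extension the abstract mentions.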
Keywords :
maximum likelihood estimation; natural language processing; speaker recognition; statistical distributions; MAP adaptation approach; UBM; maximum a posteriori; nonjoint training algorithm; nonparallel speech sample; parallel speech corpus; parameter estimation; speaker specific model adaptation; statistical feature distribution; universal background model; voice conversion algorithm; MAP; Voice conversion; non-joint training; non-parallel corpus
Conference_Title :
Signal Processing (ICSP), 2012 IEEE 11th International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2196-9
DOI :
10.1109/ICoSP.2012.6491548