DocumentCode
1843456
Title
Voice conversion with UBM and speaker-specific model adaptation
Author
Chunlei Zhu ; Yibiao Yu
Author_Institution
Sch. of Electron. & Inf. Eng., Soochow Univ., Suzhou, China
Volume
1
fYear
2012
fDate
21-25 Oct. 2012
Firstpage
553
Lastpage
556
Abstract
Traditional voice conversion algorithms are usually based on a parallel speech corpus and joint training, but parallel data are difficult to obtain and such systems are inflexible to extend in practical applications. This paper presents a non-parallel, non-joint training algorithm for voice conversion using a Universal Background Model (UBM) and Maximum a Posteriori (MAP) adaptation. First, a UBM reflecting the speaker-independent statistical distribution of features is trained on non-parallel speech samples from all speakers. Then, with the UBM acting as the prior model, each speaker-specific model is derived using a new parameter estimation based on MAP adaptation. Experimental results show that the proposed method achieves conversion performance equivalent to that of the traditional parallel-corpus-based method and is more flexible to extend.
Keywords
maximum likelihood estimation; natural language processing; speaker recognition; statistical distributions; MAP adaptation approach; UBM; maximum a posteriori; nonjoint training algorithm; nonparallel speech sample; parallel speech corpus; parameter estimation; speaker specific model adaptation; statistical feature distribution; universal background model; voice conversion algorithm; MAP; UBM; Voice conversion; non-joint training; non-parallel corpus;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing (ICSP), 2012 IEEE 11th International Conference on
Conference_Location
Beijing
ISSN
2164-5221
Print_ISBN
978-1-4673-2196-9
Type
conf
DOI
10.1109/ICoSP.2012.6491548
Filename
6491548