DocumentCode
3697460
Title
Many-to-one voice conversion using exemplar-based sparse representation
Author
Ryo Aihara;Tetsuya Takiguchi;Yasuo Ariki
Author_Institution
Graduate School of System Informatics, Kobe University, Japan
fYear
2015
Firstpage
1
Lastpage
5
Abstract
Voice conversion (VC) is widely researched in the field of speech processing because of growing interest in applications such as personalized text-to-speech systems. In this paper we present a many-to-one VC method using exemplar-based sparse representation, which differs from conventional statistical VC. In our previous exemplar-based VC method, input speech is represented as a sparse combination of exemplars from a source dictionary; because the source and target dictionaries are fully coupled, the converted voice is constructed by applying the source-side sparse coefficients to the target dictionary. Constructing these dictionaries requires parallel exemplars: source and target exemplars drawn from the same texts uttered by the source and target speakers. In this paper, we propose a many-to-one VC method in an exemplar-based framework that requires no training data from the source speaker. Although several statistical approaches to many-to-one VC have been proposed, no such method has previously been formulated in an exemplar-based framework. The effectiveness of our many-to-one VC is confirmed by comparison with a conventional one-to-one NMF-based method and a one-to-one GMM-based method.
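The one-to-one conversion step described in the abstract (input speech factored over a source dictionary, converted speech rebuilt from the coupled target dictionary) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `exemplar_vc`, the Euclidean multiplicative-update rule, and the L1 sparsity weight are assumptions chosen for a self-contained example.

```python
import numpy as np

def exemplar_vc(X, A_src, A_tgt, n_iter=200, sparsity=0.1):
    """Sketch of exemplar-based VC (hypothetical helper, not the paper's code).

    Estimates nonnegative sparse activations H such that X ~= A_src @ H,
    then maps to the target speaker as Y = A_tgt @ H. The columns of
    A_src and A_tgt are assumed to be parallel exemplars (same frame
    index in both dictionaries corresponds to the same phonetic content).
    """
    rng = np.random.default_rng(0)
    # nonnegative random initialization of the activation matrix
    H = rng.random((A_src.shape[1], X.shape[1])) + 1e-3
    for _ in range(n_iter):
        # multiplicative update for the Euclidean cost with an
        # L1 penalty on H to encourage sparse activations
        numer = A_src.T @ X
        denom = A_src.T @ (A_src @ H) + sparsity + 1e-12
        H *= numer / denom
    # converted features: target exemplars weighted by source activations
    return A_tgt @ H, H
```

A toy call with random dictionaries (`A_src`, `A_tgt` of shape features × exemplars, `X` of shape features × frames) returns converted features of shape features × frames plus the shared activation matrix.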
Keywords
"Dictionaries","Speech","Training data","Sparse matrices","Matrix converters","Signal processing","Noise robustness"
Publisher
ieee
Conference_Titel
2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
Type
conf
DOI
10.1109/WASPAA.2015.7336943
Filename
7336943
Link To Document