DocumentCode :
3124327
Title :
Perceptual clustering based unit selection optimization for concatenative text-to-speech synthesis
Author :
Tao Jiang ; Zhiyong Wu ; Jia Jia ; Lianhong Cai
Author_Institution :
Tsinghua-CUHK Joint Res. Center for Media Sci., Tsinghua Univ., Shenzhen, China
fYear :
2012
fDate :
5-8 Dec. 2012
Firstpage :
64
Lastpage :
68
Abstract :
In concatenative speech synthesis, the purpose of unit selection is to select suitable speech units from a speech corpus by measuring how well candidate units match the given target features. Perceptual tests indicate that some features are always preferred for making perceptual distinctions between units. Such features should be evaluated before the others in unit selection. In this work, we attempt to identify the priorities of different features and to optimize unit selection with perceptual clustering. Our approach first groups the speech units by hierarchical clustering, using a perceptual distance measure between speech units. A method for identifying the questions (concerning the features) is then proposed to build a decision tree from the clustering result. The features used in the decision tree are the preferred ones, and the remaining features are used in the target cost function. Linear discriminant analysis (LDA) is then adopted to train the weights of the target cost function from the clustering result, making the weights more reasonable and perceptually relevant. Experimental results indicate that the optimized unit selection generates synthetic speech with higher naturalness than the previous approach.
Keywords :
decision trees; pattern clustering; speech synthesis; statistical analysis; LDA; concatenative text-to-speech synthesis; decision tree; hierarchical clustering; linear discriminant analysis; perceptual clustering; perceptual distance measurement; speech corpus; target cost function; unit selection optimization; Cost function; Decision trees; Indexes; Speech; Speech synthesis; Training; Vectors; Perceptual clustering; cost function; decision tree; linear discriminant analysis; unit selection
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
Type :
conf
DOI :
10.1109/ISCSLP.2012.6423489
Filename :
6423489