DocumentCode :
3485521
Title :
Automatic detection of unnatural word-level segments in unit-selection speech synthesis
Author :
Wang, William Yang ; Georgila, Kallirroi
fYear :
2011
fDate :
11-15 Dec. 2011
Firstpage :
289
Lastpage :
294
Abstract :
We investigate the problem of automatically detecting unnatural word-level segments in unit-selection speech synthesis. We use a large set of features, namely, target and join costs, language models, prosodic cues, energy and spectrum, and Delta Term Frequency Inverse Document Frequency (TF-IDF), and we report comparative results between different feature types and their combinations. We also compare three modeling methods based on Support Vector Machines (SVMs), Random Forests, and Conditional Random Fields (CRFs). We then discuss our results and present a comprehensive error analysis.
Keywords :
speech synthesis; support vector machines; CRF; SVM; TF-IDF; automatic detection; comprehensive error analysis; conditional random fields; delta term frequency inverse document frequency; language models; prosodic cues; random forests; unit-selection speech synthesis; unnatural word-level segments; Acoustics; Feature extraction; Humans; Speech; Testing; Training
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
Type :
conf
DOI :
10.1109/ASRU.2011.6163946
Filename :
6163946