DocumentCode
2792070
Title
Perceptual evaluation of dynamic cost weighting for unit selection TTS
Author
Bellegarda, Jerome R.
Author_Institution
Speech & Language Technol., Apple Inc., Cupertino, CA, USA
fYear
2010
fDate
14-19 March 2010
Firstpage
4806
Lastpage
4809
Abstract
Unit selection text-to-speech synthesis relies on multiple cost criteria, each encapsulating a different aspect of acoustic and prosodic context at any given concatenation point. For a particular set of criteria, the relative weighting of the resulting costs crucially affects final candidate ranking. We have recently advocated a new weighting strategy based on a data-driven framework separately optimized for each concatenation. In this approach, the cost distribution in every information stream is dynamically leveraged to locally shift weight towards those characteristics that prove most discriminative at this point. To further validate this procedure, this paper presents formal listening evidence suggesting that dynamic cost weighting indeed entails higher perceived TTS quality.
Keywords
optimisation; speech synthesis; acoustic context; concatenation point; cost distribution; dynamic cost weighting; final candidate ranking; formal listening evidence; information stream; multiple cost criteria; optimization; perceptual evaluation; prosodic context; unit selection TTS; unit selection text-to-speech synthesis; Cost function; Diversity reception; Extrapolation; Humans; Information analysis; Natural languages; Optimization methods; Pathology; Speech analysis; Speech synthesis; candidate ranking; concatenative speech synthesis; cost weighting; unit selection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location
Dallas, TX
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5495152
Filename
5495152