DocumentCode :
3614568
Title :
Robust unit selection based on syllable prosody parameters
Author :
C. Erdem;F. Beck;D. Hirschfeld;H. Hoege;R. Hoffmann
Author_Institution :
Corp. Technol., Siemens AG, Germany
fYear :
2002
fDate :
2002
Firstpage :
159
Lastpage :
162
Abstract :
In this paper we present robust unit selection based on syllable prosody parameters (RUSSPP). After almost isolated neural network (NN) predictions of f0 contours and segmental durations, these parameters are reused to search our database for the best-fitting speech elements and acoustic prosody parameters. The search is realized with a Viterbi algorithm that operates on the syllable level but explicitly allows higher- and lower-level speech elements. Because the speech corpus available for synthesis is limited, signal processing is necessary for speech elements at concatenation points. A simple but efficient post-processing step on the selected prosody parameters is also introduced. The new method is applied and tested within our TTS system PAPAGENO for a German male news speaker. The results show that it improves the quality of both our prosody generation module and the selection process.
Keywords :
"Robustness","Computational efficiency","Neural networks","Speech synthesis","Distortion measurement","Cost function","Fuzzy logic","Speech processing","Signal processing algorithms","Acoustic distortion"
Publisher :
IEEE
Conference_Titel :
Speech Synthesis, 2002. Proceedings of 2002 IEEE Workshop on
Print_ISBN :
0-7803-7395-2
Type :
conf
DOI :
10.1109/WSS.2002.1224398
Filename :
1224398