Regional Information Center for Science and Technology - Prosody model in a Mandarin text-to-speech system based on a hierarchical approach

DocumentCode :

356691

Title :

Prosody model in a Mandarin text-to-speech system based on a hierarchical approach

Author :

Pan, Neng-Huang ; Jen, Wen-Tsai ; Yu, Shyr-Shen ; Yu, Ming-shing ; Huang, Shyh-Yang ; Wu, Ming-Jer

Author_Institution :

Dept. of Appl. Math., Nat. Chung-Hsing Univ., Taichung, Taiwan

Volume :

fYear :

2000

fDate :

2000

Firstpage :

448

Abstract :

The authors developed a prosody model for a Mandarin text-to-speech (TTS) system. We extract meaningful parameters from the voice files and text files, and we organize these parameters hierarchically. For each syllable, we consider four kinds of information (five in our duration prediction model): word information (consonant, vowel and tone); phrase information; breath-group information; and sentence information (the duration model adds the punctuation mark). In the syllable duration prediction model, 37% of the training syllables in the inside test and 43% of the test syllables in the outside test have a prediction error ratio below 0.1. The average error over all syllables is 0.182 in the inside test and 0.169 in the outside test. In the syllable volume prediction model, 81% of the training syllables in the inside test and 76.2% of the test syllables in the outside test have a prediction error ratio below 0.1. The average error over all syllables is 0.176 in the inside test and 0.166 in the outside test. For the pitch prediction module, 64% of the internal samples and 57% of the external samples have a pattern error within 5 Hz. The average pattern error over all syllables is 5 Hz in the inside test and 6 Hz in the outside test.
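The abstract reports two figures per model: the share of syllables whose prediction error ratio is below 0.1, and the average error over all syllables. The record does not define the error ratio, so the sketch below assumes it is the absolute difference between predicted and measured values divided by the measured value. The function names and the sample numbers are hypothetical and serve only to illustrate how such figures could be computed.

```python
from typing import List, Sequence, Tuple


def relative_errors(predicted: Sequence[float], measured: Sequence[float]) -> List[float]:
    """Per-syllable error ratio, ASSUMED here to be |predicted - measured| / measured."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if any(m == 0 for m in measured):
        raise ValueError("measured values must be non-zero")
    return [abs(p - m) / abs(m) for p, m in zip(predicted, measured)]


def summarize(
    predicted: Sequence[float],
    measured: Sequence[float],
    threshold: float = 0.1,
) -> Tuple[float, float]:
    """Return (fraction of syllables with error ratio below threshold, mean error ratio)."""
    errors = relative_errors(predicted, measured)
    below = sum(1 for e in errors if e < threshold)
    return below / len(errors), sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical syllable durations in milliseconds, not data from the paper.
    predicted_ms = [105.0, 230.0, 150.0, 96.0]
    measured_ms = [100.0, 200.0, 200.0, 100.0]
    fraction, mean_error = summarize(predicted_ms, measured_ms)
    print(f"share below 0.1: {fraction:.1%}, mean error ratio: {mean_error:.4f}")
```

The same two summary figures would apply to the volume model under this assumption; the pitch figures are reported in Hz, so an absolute rather than relative error would be the natural counterpart there.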

Keywords :

natural languages; performance evaluation; speech synthesis; text analysis; Mandarin text-to-speech system; average error; average pattern error; breath group; duration prediction model; external samples; hierarchical approach; internal samples; outside test; pattern error; performance evaluation; phrase; pitch prediction module; prediction error; prosody model; punctuation mark; sentences; syllable duration prediction model; syllable volume prediction model; text files; training syllables; voice files; word information; Data mining; Mathematical model; Mathematics; Predictive models; Speech analysis; Speech synthesis; Statistics; Synthesizers; Testing; Text analysis;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on

Conference_Location :

New York, NY

Print_ISBN :

0-7803-6536-4

Type :

conf

DOI :

10.1109/ICME.2000.869636

Filename :

869636

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=356691