DocumentCode :
2838906
Title :
An acoustic and articulatory knowledge integrated method for improving synthetic Mandarin speech's fluency
Author :
Hung-Yan Gu ; Wang, Kuo-Hsian
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ. of Sci. & Technol., Taipei, Taiwan
fYear :
2004
fDate :
15-18 Dec. 2004
Firstpage :
205
Lastpage :
208
Abstract :
In synthetic Mandarin speech, discontinuity of formant traces at syllable boundaries is a key factor that lowers fluency. We therefore study a method that integrates acoustic and articulatory knowledge to solve this discontinuity problem. First, representative trisyllable contexts are selected and their signals are recorded. The signal of the middle syllable of each trisyllable pronunciation is then extracted to make a synthesis unit. To select a synthesis unit among multiple candidates, a distance function is defined to measure the spectral similarity between two synthesis units to be concatenated. In addition, several linking-restriction rules are derived from articulatory knowledge to prevent certain synthesis units from being linked into a sequence. A globally best synthesis-unit sequence is then found with a dynamic-programming-based search. When this method is applied, the formant traces at syllable boundaries become smoother, and subjective evaluation shows that the fluency of synthetic Mandarin speech is indeed improved considerably.
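The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Unit` class, the use of Euclidean distance between boundary feature vectors, and the `allowed` predicate standing in for the linking-restriction rules are all assumptions made for illustration. The abstract does not specify the actual distance function or the rules.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Unit:
    """Hypothetical synthesis unit: a label plus spectral features at each edge."""
    label: str
    head: Tuple[float, ...]  # features at the start of the syllable (assumed)
    tail: Tuple[float, ...]  # features at the end of the syllable (assumed)


def boundary_distance(prev: Unit, nxt: Unit) -> float:
    # Assumption: Euclidean distance stands in for the paper's distance function.
    return math.dist(prev.tail, nxt.head)


def best_sequence(
    candidates: Sequence[Sequence[Unit]],
    allowed: Callable[[Unit, Unit], bool] = lambda prev, nxt: True,
) -> Tuple[float, List[Unit]]:
    """Pick one unit per syllable so the summed boundary distance is minimal.

    `allowed` is a hypothetical stand-in for linking-restriction rules:
    it returns False for pairs that must not be concatenated.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every syllable needs at least one candidate unit")

    cost = [0.0] * len(candidates[0])   # best cost ending at each first-syllable unit
    back: List[List[int]] = []          # back-pointers for each later syllable

    for i in range(1, len(candidates)):
        new_cost, pointers = [], []
        for nxt in candidates[i]:
            best, arg = math.inf, -1
            for j, prev in enumerate(candidates[i - 1]):
                if cost[j] == math.inf or not allowed(prev, nxt):
                    continue  # unreachable predecessor or forbidden link
                c = cost[j] + boundary_distance(prev, nxt)
                if c < best:
                    best, arg = c, j
            new_cost.append(best)
            pointers.append(arg)
        cost = new_cost
        back.append(pointers)

    end = min(range(len(cost)), key=cost.__getitem__)
    if cost[end] == math.inf:
        raise ValueError("no unit sequence satisfies the linking restrictions")

    # Trace the back-pointers from the cheapest final unit to the first syllable.
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return cost[end], [candidates[i][k] for i, k in enumerate(path)]
```

Because each step only needs the best cost for every candidate of the previous syllable, the search is quadratic in the number of candidates per syllable and linear in the number of syllables, rather than enumerating every possible sequence.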
Keywords :
dynamic programming; knowledge based systems; natural languages; speech; speech synthesis; acoustic knowledge; articulatory knowledge; distance function; dynamic programming; formant trace discontinuities; linking-restriction rules; spectral similarity; syllable boundaries; synthesis unit; synthetic Mandarin speech fluency; trisyllable contexts; Acoustic testing; Acoustical engineering; Computer science; Concatenated codes; Dynamic programming; Heuristic algorithms; Signal synthesis; Speech analysis; Speech processing; Speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN :
0-7803-8678-7
Type :
conf
DOI :
10.1109/CHINSL.2004.1409622
Filename :
1409622