DocumentCode
2247235
Title
Prosody generation in TTS system for Azeri
Author
Damadi, M.S. ; Azami, Bahram Zahir ; Eslami, Moharram
Author_Institution
Univ. of Kurdistan, Sanandaj, Iran
fYear
2010
fDate
6-9 July 2010
Firstpage
1335
Lastpage
1338
Abstract
Naturalness is essential to producing high-quality waveforms in Text-to-Speech (TTS) systems. The naturalness of the waveform is highly correlated with phonetic coverage and with prosodic features such as loudness, duration, and pitch. This paper addresses the implementation of a prosodic TTS system for Azeri. The TTS system to which the prosodic information is added is a concatenative synthesizer based on diphones. To add prosody and increase naturalness, we obtain a primary pitch curve for each word based on the location of its stressed syllable. The final pitch contour is then modified according to sentence-type effects. As far as we know, the speech produced by this system is the first prosodic Azeri synthetic speech ever created. Subjective listening tests confirmed the high intelligibility and acceptable naturalness of the synthesized speech.
Keywords
natural language processing; speech synthesis; concatenative synthesizer; diphones; phonetic coverage; prosodic Azeri synthetic speech; prosodic feature; prosodic text-to-speech system; prosody generation; sentence type effects; Conferences; Mechatronics; F0 contour; concatenation; diphone; intonation; pitch pattern; stress;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Intelligent Mechatronics (AIM), 2010 IEEE/ASME International Conference on
Conference_Location
Montreal, ON
Print_ISBN
978-1-4244-8031-9
Type
conf
DOI
10.1109/AIM.2010.5695772
Filename
5695772