DocumentCode :
1908904
Title :
A Novel Gaussian Filter-Based Automatic Labeling of Speech Data for TTS System in Gujarati Language
Author :
Talesara, Swati ; Patil, Hemant A. ; Patel, T. ; Sailor, Hardik ; Shah, Neil
Author_Institution :
Dhirubhai Ambani Inst. of Inf. & Commun. Technol. (DA-IICT), Gandhinagar, India
fYear :
2013
fDate :
17-19 Aug. 2013
Firstpage :
139
Lastpage :
142
Abstract :
Text-to-speech (TTS) synthesizers have proved to be a useful aid for many visually challenged people, allowing them to read through auditory feedback. TTS synthesizers are available for English; however, it has been observed that people are more comfortable listening to their own native language. With this in mind, a Gujarati TTS synthesizer has been built in the Festival speech synthesis framework. The syllable is taken as the basic unit, since Indian languages are syllabic in nature. Building a unit-selection-based Gujarati TTS system requires a large labeled Gujarati corpus. Labeling is a highly time-consuming and tedious task that requires considerable manual effort. Therefore, this work attempts to reduce that effort by automatically generating a corpus labeled at the syllable level. To that end, a Gaussian-based segmentation method is proposed for automatic segmentation of speech at the syllable level. The percentage correctness of the labeled data is around 80% for both male and female voices, compared to 70% for group delay-based labeling. In addition, the system built with the proposed approach shows better intelligibility when evaluated by a visually challenged subject. The word error rate is reduced by 5% for the Gaussian filter-based TTS system compared to the group delay-based TTS system, and a 5% increase is observed in correctly synthesized words. The main focus of this work is to reduce the manual effort required to build a TTS system for Gujarati, which is primarily the effort of labeling speech data.
Keywords :
Gaussian processes; handicapped aids; natural language processing; speech synthesis; English; Gaussian filter-based TTS system; Gaussian filter-based automatic speech data labeling; Gaussian-based segmentation method; Gujarati TTS synthesizer; Gujarati language; aiding tool; automatic speech segmentation; correctly synthesized words; group delay-based TTS system; hearing feedback; labeled corpus; native language; syllable-level; text-to-speech synthesizer; unit-selection based Gujarati TTS system; visually challenged people; Gaussian filter; TTS; group delay; labeling; syllable;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Asian Language Processing (IALP), 2013 International Conference on
Conference_Location :
Urumqi
Type :
conf
DOI :
10.1109/IALP.2013.46
Filename :
6646022