DocumentCode :
1712347
Title :
Development and evaluation of unit selection and HMM-based speech synthesis systems for Tamil
Author :
Boothalingam, Ramani ; Sherlin Solomi, V ; Gladston, Anushiya Rachel ; Christina, S Lilly ; Vijayalakshmi, P ; Thangavelu, Nagarajan ; Murthy, Hema A
Author_Institution :
Speech Lab, SSN College of Engineering, India
fYear :
2013
Firstpage :
1
Lastpage :
5
Abstract :
An unrestricted text-to-speech system is expected to produce a speech signal, corresponding to the given text in a language, that is highly intelligible to a human listener. Presently, unit selection-based synthesis (USS) and statistical parametric synthesis are the state-of-the-art techniques for this task. Earlier, in [3], a concatenative synthesizer was developed for Tamil using 12 hours of speech data, and it was shown that the syllable is the better subword unit. The current work focuses on building FestVox voices using the phoneme or CV unit as the subword unit, for a reduced amount of speech data (5 hours), and on comparing their performance in terms of quality. A further aim is to compare the performance of this synthesizer with that of the well-known HMM-based speech synthesizer. Of the phoneme- and CV-based systems built, although a phoneme-based system is bound to have more concatenation points, it is observed to outperform the CV-based system, with an MOS of 2.96, primarily because more examples are available for each phoneme for the given amount of speech data. Further, an HMM-based speech synthesis system is developed using 5 hours of data. Although the speaker identity is not completely preserved in the synthesized speech, there are no sonic glitches, and the quality obtained, with an MOS of 3.86, is much better than that of the phoneme/CV-based systems. In addition, the footprint of the system is drastically reduced, from 1 GB for the USS system to 6 MB for the HMM-based speech synthesis system.
Keywords :
Buildings; Context modeling; Databases; Feature extraction; Hidden Markov models; Speech; Speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communications (NCC), 2013 National Conference on
Conference_Location :
New Delhi, India
Print_ISBN :
978-1-4673-5950-4
Electronic_ISBN :
978-1-4673-5951-1
Type :
conf
DOI :
10.1109/NCC.2013.6487984
Filename :
6487984