Title :
A hybrid Text-to-Speech synthesis using vowel and non vowel like regions
Author :
Adiga, N. ; Mahadeva Prasanna, S.R.
Author_Institution :
Dept. of Electron. & Electr. Eng., Indian Inst. of Technol., Guwahati, Guwahati, India
Abstract :
This paper presents a hybrid Text-to-Speech synthesis (TTS) approach that combines the advantages of Hidden Markov model speech synthesis (HTS) and Unit selection speech synthesis (USS). In hybrid TTS, speech sound units are classified into vowel like regions (VLRs) and non vowel like regions (NVLRs) for unit selection. VLRs here refer to vowel, diphthong, semivowel and nasal sound units [1], which can be better modeled by the HMM framework, and hence their waveform units are chosen from HTS. The remaining sound units, such as stop consonants, fricatives and affricates, which are not modeled properly using HMM [2], are classified as NVLRs, and for these phonetic classes natural sound units are picked from USS. The VLR and NVLR evidence is obtained from manual and automatic segmentation of the speech signal. The automatic detection is done by fusing source features obtained from the Hilbert envelope (HE) and Zero frequency filter (ZFF) of the speech signal. Speech synthesized by the manual and automated hybrid TTS methods is compared with the HTS and USS voices using subjective and objective measures. Results show that the synthesis quality of hybrid TTS with manual segmentation is better than that of the HTS voice, whereas automatic segmentation yields slightly inferior quality.
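The routing rule described in the abstract (VLR units taken from HTS, NVLR units taken from USS) can be illustrated with a minimal sketch. This is not the authors' implementation: the class labels, the function names `region_of`, `source_for` and `plan_units`, and the `(phone, phone_class)` input format are all assumptions made for illustration, and the segmentation step (manual or HE/ZFF based) is taken as already done.

```python
# Hypothetical phonetic-class labels; the abstract names the classes
# but does not specify a phone set or label format.
VLR_CLASSES = {"vowel", "diphthong", "semivowel", "nasal"}
NVLR_CLASSES = {"stop", "fricative", "affricate"}


def region_of(phone_class):
    """Map a phonetic class to 'VLR' or 'NVLR' as defined in the abstract."""
    if phone_class in VLR_CLASSES:
        return "VLR"
    if phone_class in NVLR_CLASSES:
        return "NVLR"
    raise ValueError(f"unknown phonetic class: {phone_class!r}")


def source_for(phone_class):
    """Pick the synthesizer that supplies the waveform unit.

    VLR units come from the HMM-based voice (HTS); NVLR units come
    from the unit selection voice (USS).
    """
    return "HTS" if region_of(phone_class) == "VLR" else "USS"


def plan_units(units):
    """Annotate (phone, phone_class) pairs with their region and source."""
    return [(phone, region_of(cls), source_for(cls)) for phone, cls in units]


if __name__ == "__main__":
    # Illustrative input only; labels are assumed, not taken from the paper.
    for row in plan_units([("k", "stop"), ("a", "vowel"), ("m", "nasal")]):
        print(row)
```

The sketch covers only the selection decision; concatenating the chosen HTS and USS waveform units is outside its scope.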
Keywords :
filtering theory; hidden Markov models; speech processing; speech synthesis; HE; HMM framework; HTS voice; Hidden Markov model speech synthesis; Hilbert envelope; NVLR; USS voice; VLR; ZFF; automated hybrid TTS method; automatic speech signal segmentation; diphthong units; hybrid text-to-speech synthesis; nasal sound units; nonvowel like regions; semivowel units; source feature fusion; speech sound unit classification; unit selection speech synthesis; vowel like regions; waveform units; zero frequency filter; Cepstral analysis; Databases; Hidden Markov models; High-temperature superconductors; Manuals; Speech; Speech synthesis; HTS; VLRs and NVLRs; hybrid TTS; speech synthesis; unit selection;
Conference_Title :
India Conference (INDICON), 2014 Annual IEEE
Conference_Location :
Pune
Print_ISBN :
978-1-4799-5362-2
DOI :
10.1109/INDICON.2014.7030526