Regional Information Center for Science and Technology (RICeST) - Designing prosody rule-set for converting neutral TTS speech to storytelling style speech for Indian languages: Bengali, Hindi and Telugu

DocumentCode :

234801

Title :

Designing prosody rule-set for converting neutral TTS speech to storytelling style speech for Indian languages: Bengali, Hindi and Telugu

Author :

Sarkar, Pradyut ; Haque, Ashraful ; Dutta, Achyut K. ; Reddy, Gurunath M. ; Harikrishna, M.D. ; Dhara, P. ; Verma, Rajesh ; Narendra, P.N. ; Sunil, B. Kr S. ; Yadav, J. ; Rao, K. Sreenivasa

Author_Institution :

Sch. of Inf. Technol., Indian Inst. of Technol., Kharagpur, Kharagpur, India

fYear :

2014

fDate :

7-9 Aug. 2014

Firstpage :

473

Lastpage :

477

Abstract :

This paper presents the design of a prosody rule-set for transforming neutral speech synthesized by a Text-to-Speech (TTS) system into storytelling style speech. The objective is to synthesize storyteller speech from a neutral TTS system, given a story text as input. Here, neutral TTS refers to a TTS system developed using the Festival framework with a neutral speech corpus. To generate storyteller speech from neutral TTS, we propose modifications to several prosodic parameters of the neutral synthesized speech: (i) pitch contour, (ii) duration patterns, (iii) intensity patterns, (iv) pause patterns and (v) tempo. We have designed individual rule-sets for these prosodic parameters, separately for three Indian languages: Bengali, Hindi and Telugu. The rule-sets are designed by analyzing the perceptual differences between synthesized neutral speech utterances and the corresponding natural (original) utterances narrated by a storyteller. The designed prosody rule-sets are evaluated using subjective listening tests. The results of the perceptual evaluation indicate that the designed prosody rule-sets play a significant role in achieving the story-specific style during conversion from neutral to storytelling style speech.
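The idea of a per-parameter rule-set can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Segment` and `ProsodyRule` types, the `apply_rule` function, and every numeric value in `DEMO_RULE` are hypothetical placeholders chosen for illustration, since the abstract does not state the actual rules or their values. The sketch only shows how one rule might adjust the five prosodic parameters named above for a single synthesized segment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Prosodic description of one synthesized unit (hypothetical structure)."""
    f0_hz: List[float]     # pitch contour samples in Hz
    duration_s: float      # segment duration in seconds
    intensity_db: float    # mean intensity in dB
    pause_after_s: float   # silence following the segment, in seconds


@dataclass
class ProsodyRule:
    """One hypothetical rule; the defaults leave the neutral speech unchanged."""
    pitch_scale: float = 1.0         # multiplies every pitch sample
    duration_scale: float = 1.0      # multiplies the segment duration
    intensity_shift_db: float = 0.0  # added to the mean intensity
    pause_scale: float = 1.0         # multiplies the following pause
    tempo_scale: float = 1.0         # > 1 means faster speech overall


# Placeholder values for illustration only; NOT taken from the paper.
DEMO_RULE = ProsodyRule(
    pitch_scale=1.25,
    duration_scale=1.5,
    intensity_shift_db=3.0,
    pause_scale=2.0,
    tempo_scale=1.0,
)


def apply_rule(seg: Segment, rule: ProsodyRule) -> Segment:
    """Return a new segment with the rule's modifications applied."""
    # A faster tempo shortens both the segment and the pause after it.
    time_factor = 1.0 / rule.tempo_scale
    return Segment(
        f0_hz=[f * rule.pitch_scale for f in seg.f0_hz],
        duration_s=seg.duration_s * rule.duration_scale * time_factor,
        intensity_db=seg.intensity_db + rule.intensity_shift_db,
        pause_after_s=seg.pause_after_s * rule.pause_scale * time_factor,
    )


if __name__ == "__main__":
    neutral = Segment(f0_hz=[200.0, 220.0], duration_s=0.5,
                      intensity_db=60.0, pause_after_s=0.25)
    print(apply_rule(neutral, DEMO_RULE))
```

In a setup like the one the abstract describes, one would presumably keep a separate collection of such rules for each language (Bengali, Hindi, Telugu) and tune them against storyteller recordings; the single demo rule here stands in for that.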

Keywords :

natural language processing; speech synthesis; Bengali; Hindi; Indian languages; Telugu; duration patterns; intensity patterns; neutral TTS system; pause patterns; pitch contour; prosody rule-set; storytelling style speech; text-to-speech system; Emotion recognition; Information technology; Semantics; Speech; Speech recognition; Speech synthesis; Expressive speech synthesis; Neutral TTS; Prosody rule-set; Story specific prosody generation; Storytelling style; emotion-salient words; story-specific emotion detection; story-specific prosody incorporation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Contemporary Computing (IC3), 2014 Seventh International Conference on

Conference_Location :

Noida

Print_ISBN :

978-1-4799-5172-7

Type :

conf

DOI :

10.1109/IC3.2014.6897219

Filename :

6897219

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=234801