• DocumentCode
    1607260
  • Title
    Better human computer interaction by enhancing the quality of text-to-speech synthesis
  • Author
    Reddy, V.R.; Rao, K. Sreenivasa
  • Author_Institution
    Sch. of Inf. Technol., Indian Inst. of Technol. Kharagpur, Kharagpur, India
  • fYear
    2012
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    In this paper, we propose high-quality prosody models that enhance the quality of text-to-speech (TTS) synthesis and thereby provide better human-computer interaction. In this study, prosody refers to the duration and intonation patterns of a sequence of syllables. The prosody models are developed using feedforward neural networks, and prosodic information is predicted from the linguistic and production constraints of syllables. The prediction accuracy of the proposed neural-network-based prosody models is compared objectively with that of the Classification and Regression Tree (CART) based prosody models used by Festival. Subjective listening tests are also performed to evaluate the quality of the speech synthesized with the predicted prosodic features. The evaluation shows that prediction accuracy is better for the neural network models than for the CART-based models.
  • Keywords
    feedforward neural nets; human computer interaction; speech synthesis; text analysis; TTS synthesis quality enhancement; feedforward neural networks; linguistic constraints; production constraints; prosodic information prediction; prosody models; subjective listening tests; syllable sequence duration; syllable sequence intonation patterns; text-to-speech synthesis quality enhancement; Accuracy; Computational modeling; Neural networks; Pragmatics; Predictive models; Production; Speech; Festival; prosody; text-to-speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Human Computer Interaction (IHCI), 2012 4th International Conference on
  • Conference_Location
    Kharagpur
  • Print_ISBN
    978-1-4673-4367-1
  • Type
    conf
  • DOI
    10.1109/IHCI.2012.6481857
  • Filename
    6481857