Stress annotated Urdu speech corpus to build female voice for TTS

Author

Benazir Mumtaz;Saba Urooj;Sarmad Hussain;Wajiha Habib

Author_Institution

Centre for Language Engineering, Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, Lahore, Pakistan

fYear

2015

Firstpage

13

Lastpage

20

Abstract

This paper describes the stress annotation of a two-hour Urdu speech corpus, containing 18,640 words and 28,866 syllables, for building a natural-sounding voice for a text-to-speech (TTS) system. Two stress marking algorithms, one phonological and one acoustic, were evaluated against perceptual stress marking. The Urdu phonological stress marking algorithm [1] achieves 70% accuracy, whereas the Urdu acoustic stress marking algorithm developed in this research achieves 81.2% accuracy. The acoustic algorithm was then used to annotate the two-hour corpus. It is semi-automatic: 54% of the data is annotated automatically using the duration cue, while the remaining 46% is marked manually using the acoustic cues of pitch, glottalization, and intensity.
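
The semi-automatic workflow in the abstract (duration decides the clear cases, a human annotator handles the rest) can be illustrated with a minimal sketch. The abstract does not state the actual decision rule, so everything here is an assumption made for illustration: the `Syllable` type, the `mark_stress_by_duration` and `triage` functions, and the `margin` threshold of 1.2 are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Syllable:
    text: str
    duration_ms: float


def mark_stress_by_duration(word: List[Syllable], margin: float = 1.2) -> Optional[int]:
    """Return the index of the stressed syllable, or None to defer to a human.

    Hypothetical rule (not from the paper): the longest syllable is marked
    as stressed only if it is at least `margin` times as long as the
    second-longest syllable in the same word.
    """
    if len(word) < 2:
        # No duration contrast to compare, so defer to manual annotation.
        return None
    ranked = sorted(range(len(word)), key=lambda i: word[i].duration_ms, reverse=True)
    longest, runner_up = ranked[0], ranked[1]
    if word[longest].duration_ms >= margin * word[runner_up].duration_ms:
        return longest
    return None


def triage(words: List[List[Syllable]]) -> Tuple[list, list]:
    """Split words into automatically marked ones and ones needing manual review."""
    auto, manual = [], []
    for word in words:
        idx = mark_stress_by_duration(word)
        if idx is not None:
            auto.append((word, idx))
        else:
            manual.append(word)
    return auto, manual
```

In this sketch, words in the `manual` list would be passed to an annotator who consults pitch, glottalization, and intensity, mirroring the split between automatic and manual marking that the abstract reports.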

Keywords

"Stress","Acoustics","Speech","Indexes","Software"

Publisher

IEEE

Conference_Titel

Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015 International Conference

Type

conf

DOI

10.1109/ICSDA.2015.7357857

Filename

7357857