DocumentCode
1904533
Title
Statistical syllables selection approach for the preparation of Punjabi speech database
Author
Singh, Parminder ; Lehal, Gurpreet Singh
Author_Institution
Dept. of Comput. Sci. & Eng., Nanak Dev Eng. Coll., India
fYear
2010
fDate
8-11 Nov. 2010
Firstpage
1
Lastpage
4
Abstract
This paper presents the results of a statistical analysis of Punjabi syllables over a large Punjabi corpus. Syllables have been reported as a good choice of speech unit for the speech databases of many languages, and they are also selected here as the speech unit for the development of a Punjabi speech database. To minimize the database size, an effort has been made to select a minimal set of syllables that covers almost the whole Punjabi word set. To this end, all Punjabi syllables have been statistically analyzed on a Punjabi corpus of more than 104 million words. The analysis yields important results that help to select a relatively small set of the most frequently occurring syllables (about the first ten thousand syllables, 0.86% of the total), having a cumulative frequency of occurrence (FOO) of less than 99.81%, out of 1,156,740 total available syllables. In addition, to improve the efficiency of the text-to-speech (TTS) system, facts about Punjabi syllables have been obtained based on their FOO at the three positions (starting, middle and end) in words.
Keywords
database management systems; speech synthesis; statistical analysis; FOO; Punjabi corpus; Punjabi speech database; Punjabi word set; TTS; database size; frequency of occurrence; speech unit; statistical syllables selection approach; text-to-speech
fLanguage
English
Publisher
ieee
Conference_Titel
Internet Technology and Secured Transactions (ICITST), 2010 International Conference for
Conference_Location
London
Print_ISBN
978-1-4244-8862-9
Electronic_ISBN
978-0-9564263-6-9
Type
conf
Filename
5678557