• DocumentCode
    1904533
  • Title

    Statistical syllables selection approach for the preparation of Punjabi speech database

  • Author

    Singh, Parminder ; Lehal, Gurpreet Singh

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Nanak Dev Eng. Coll., India
  • fYear
    2010
  • fDate
    8-11 Nov. 2010
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper discusses the results of a statistical analysis of Punjabi syllables over a large Punjabi corpus. Syllables have been reported as a good choice of speech unit for the speech databases of many languages, and they have also been selected as the speech unit for the development of the Punjabi speech database in this work. To minimize the database size, efforts have been made to select a minimal set of syllables covering almost the whole Punjabi word set. For this purpose, all Punjabi syllables have been statistically analyzed on a Punjabi corpus of more than 104 million words. The analysis yields interesting and important results that help to select a relatively small set of the most frequently occurring syllables (about the first ten thousand syllables, 0.86% of the total), having a cumulative frequency of occurrence (FOO) of less than 99.81%, out of 1156740 total available syllables. Also, to improve the efficiency of the text-to-speech (TTS) system, interesting facts about Punjabi syllables have been obtained based on their FOO at the three positions (starting, middle and end) in words.
  • Keywords
    database management systems; speech synthesis; statistical analysis; FOO; Punjabi corpus; Punjabi speech database; Punjabi word set; TTS; database size; frequency of occurrence; speech unit; statistical syllables selection approach; text-to-speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Internet Technology and Secured Transactions (ICITST), 2010 International Conference for
  • Conference_Location
    London
  • Print_ISBN
    978-1-4244-8862-9
  • Electronic_ISBN
    978-0-9564263-6-9
  • Type

    conf

  • Filename
    5678557