• DocumentCode
    3713032
  • Title

    Corpus design and development of an annotated speech database for Punjabi

  • Author

    Shweta Bansal;Shambhu Sharan;S.S. Agrawal

  • Author_Institution
    KUT College of Engineering, Gurgaon, India
  • fYear
    2015
  • Firstpage
    32
  • Lastpage
    37
  • Abstract
    Punjabi is an important Indo-Aryan language spoken in India and in some other countries, especially Pakistan. It is a tonal language, and its phonetic and phonological aspects have not been studied extensively. The present paper reports the development of a phonemically annotated speech database of the Malwai dialect of Punjabi. A phonetically rich text database of 1500 words and 300 sentences was created from a corpus of about 300,000 words. These were recorded by 25 male and 25 female speakers at a sampling rate of 16 kHz and 16 bits per sample. The recordings were made in the native places of speakers possessing the original version of the Malwai dialect of Punjabi. The recorded data were segmented and labeled phonemically to obtain the phonemic and sub-phonemic elements of each phoneme and the tonemes of the Punjabi language. The annotated database can be useful for phonetic studies and for developing a Punjabi speech synthesis system.
  • Keywords
    "Databases","Speech","Frequency modulation","Dentistry"
  • Publisher
    ieee
  • Conference_Titel
    Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015 International Conference
  • Type

    conf

  • DOI
    10.1109/ICSDA.2015.7357860
  • Filename
    7357860