• DocumentCode
    2037302
  • Title
    Significance of segmentation in phoneme based Tamil speech recognition system
  • Author
    Harish, S. ; Vijayalakshmi, P. ; Nagarajan, T.
  • Author_Institution
    Dept. of Electron. & Commun. Eng., Rajiv Gandhi Salai, Chennai, India
  • Volume
    3
  • fYear
    2011
  • fDate
    8-10 April 2011
  • Firstpage
    212
  • Lastpage
    215
  • Abstract
    Over the last few decades, speech recognition has evolved and matured enough to be used in commercial applications. These applications include automatic dictation software, voice dialling, voice-controlled navigation and simple data entry. Automatic Speech Recognition (ASR) deals with the automatic conversion of the acoustic signal of an utterance into text. In this work, a speech recognition system for the Tamil language is developed. Speech recognition requires segmentation of the speech waveform into fundamental acoustic units. The word is the natural unit of speech. However, each word has to be trained individually, and parameters cannot be shared among words. Hence, a very large training set is essential so that all words in the vocabulary are adequately trained. In addition, the memory requirement grows linearly with the number of words. The preferred unit for overcoming these constraints is the phone. Phone-based systems need fewer models, and these models can be well trained. For the current work, phone units such as monophones and triphones are considered. This work highlights the importance of segmented speech, the language model and the co-articulation effect, which influences speech production. The triphone is a phone unit that takes the co-articulation effect into account. Monophone- and triphone-based speech recognition systems for Tamil are developed, and their performance shows the importance of the above-mentioned factors.
  • Keywords
    natural language processing; speech recognition; Tamil language; acoustic signals conversion; automatic dictation software; automatic speech recognition; monophones; phoneme based Tamil speech recognition system; speech waveform segmentation; triphones; voice controlled navigation; voice dialling; Accuracy; Context; Context modeling; Data models; Hidden Markov models; Speech; Speech recognition; co-articulation; language model; lexicon; segmentation; speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electronics Computer Technology (ICECT), 2011 3rd International Conference on
  • Conference_Location
    Kanyakumari
  • Print_ISBN
    978-1-4244-8678-6
  • Electronic_ISBN
    978-1-4244-8679-3
  • Type
    conf
  • DOI
    10.1109/ICECTECH.2011.5941739
  • Filename
    5941739