DocumentCode :
2037302
Title :
Significance of segmentation in phoneme based Tamil speech recognition system
Author :
Harish, S. ; Vijayalakshmi, P. ; Nagarajan, T.
Author_Institution :
Dept. of Electron. & Commun. Eng., Rajiv Gandhi Salai, Chennai, India
Volume :
3
fYear :
2011
fDate :
8-10 April 2011
Firstpage :
212
Lastpage :
215
Abstract :
Over the last few decades, speech recognition has evolved and matured enough to be used in commercial applications. These applications include automatic dictation software, voice dialling, voice-controlled navigation and simple data entry. Automatic Speech Recognition (ASR) deals with the automatic conversion of the acoustic signal of an utterance into text. In this work, a speech recognition system for the Tamil language is developed. Speech recognition requires segmentation of the speech waveform into fundamental acoustic units. The word is the natural unit of speech. However, each word has to be trained individually, and parameters cannot be shared among words. Hence, a very large training set is essential so that all words in the vocabulary are adequately trained. Memory requirements also grow linearly with the number of words. The preferred unit for overcoming these constraints is the phone: fewer models are needed, and they are well trained. For the current work, two phone units, monophones and triphones, are considered. This work highlights the importance of segmented speech, the language model and the co-articulation effect, which influences speech production. A triphone is a phone unit that takes the co-articulation effect into account. Monophone- and triphone-based speech recognition systems for Tamil are developed, and their performance shows the importance of the above-mentioned factors.
Keywords :
natural language processing; speech recognition; Tamil language; acoustic signals conversion; automatic dictation software; automatic speech recognition; monophones; phoneme based Tamil speech recognition system; speech waveform segmentation; triphones; voice controlled navigation; voice dialling; Accuracy; Context; Context modeling; Data models; Hidden Markov models; Speech; co-articulation; language model; lexicon; segmentation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electronics Computer Technology (ICECT), 2011 3rd International Conference on
Conference_Location :
Kanyakumari
Print_ISBN :
978-1-4244-8678-6
Electronic_ISBN :
978-1-4244-8679-3
Type :
conf
DOI :
10.1109/ICECTECH.2011.5941739
Filename :
5941739