DocumentCode
1923343
Title
Dynamic segmentation of vocal extract for Assamese Speech to Text Conversion using RNN
Author
Dutta, Krishna ; Sarma, Kandarpa Kumar
Author_Institution
Dept. of Electron. & Commun. Technol., Gauhati Univ., Guwahati, India
fYear
2012
fDate
2-3 March 2012
Firstpage
126
Lastpage
131
Abstract
This work proposes a prototype Speech to Text Conversion System (STCS) for the Assamese language, a major language of North-East India, using Linear Predictive Coding (LPC) and a Recurrent Neural Network (RNN). LPC features are extracted from utterances of isolated Assamese phonemes and used to train an RNN by a proposed dynamic method. The method segments an utterance using an optimal dynamic criterion, dynamically adjusting the window length required to recognize each phoneme, in order to improve success scores when testing the STCS. Its performance is compared with a conventional static RNN-based STCS, which is trained using prior knowledge of the window lengths required for recognizing different phonemes.
Keywords
feature extraction; linear predictive coding; natural language processing; recurrent neural nets; speech recognition; Assamese language; Assamese speech to text conversion system; LPC; LPC feature extraction; RNN based STCS system; linear predictive coding; optimal dynamic criterion; phoneme recognition; recurrent neural network; vocal extract dynamic segmentation; Feature extraction; Finite impulse response filter; Speech; Speech processing; Speech recognition; Testing; Training; Dynamic Segmentation; LPC; Moving Average Filter; RNN; STCS;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Signal Processing (CISP), 2012 2nd National Conference on
Conference_Location
Guwahati, Assam
Print_ISBN
978-1-4577-0719-3
Type
conf
DOI
10.1109/NCCISP.2012.6189692
Filename
6189692