Dynamic segmentation of vocal extract for Assamese Speech to Text Conversion using RNN

Author

Dutta, Krishna ; Sarma, Kandarpa Kumar

Author_Institution

Dept. of Electron. & Commun. Technol., Gauhati Univ., Guwahati, India

fYear

2012

fDate

2-3 March 2012

Firstpage

126

Lastpage

131

Abstract

The current work proposes a prototype Speech to Text Conversion System (STCS) for the Assamese language using Linear Predictive Coding (LPC) and a Recurrent Neural Network (RNN). LPC features are extracted from utterances of isolated phonemes of Assamese, a major language of North-East India, and are used to train an RNN by a proposed dynamic method. The method segments an utterance with an optimal dynamic criterion to improve the success scores during testing of the STCS: it dynamically adjusts the length of the windows required for recognizing different phonemes. Its performance is compared with that of a conventional static RNN-based STCS, which is trained using prior knowledge of the window lengths required for recognizing different phonemes.

Keywords

feature extraction; linear predictive coding; natural language processing; recurrent neural nets; speech recognition; Assamese language; Assamese speech to text conversion system; LPC; LPC feature extraction; RNN based STCS system; linear predictive coding; optimal dynamic criterion; phoneme recognition; recurrent neural network; vocal extract dynamic segmentation; Feature extraction; Finite impulse response filter; Speech; Speech processing; Speech recognition; Testing; Training; Dynamic Segmentation; LPC; Moving Average Filter; RNN; STCS

fLanguage

English

Publisher

IEEE

Conference_Titel

Computational Intelligence and Signal Processing (CISP), 2012 2nd National Conference on

Conference_Location

Guwahati, Assam

Print_ISBN

978-1-4577-0719-3

Type

conf

DOI

10.1109/NCCISP.2012.6189692

Filename

6189692