DocumentCode :
2726440
Title :
Bangla Speech Recognition System Using LPC and ANN
Author :
Paul, Anup Kumar ; Das, Dipankar ; Kamal, Md Mustafa
Author_Institution :
Dhaka City Coll., Dhaka
fYear :
2009
fDate :
4-6 Feb. 2009
Firstpage :
171
Lastpage :
174
Abstract :
This paper presents a Bangla speech recognition system. The system is divided into two major parts: speech signal processing and speech pattern recognition. The speech processing stage consists of speech start- and end-point detection, windowing, filtering, calculation of the linear predictive coding (LPC) and cepstral coefficients, and finally construction of the codebook by vector quantization. The second part is a pattern recognition system using an artificial neural network (ANN). Speech signals are recorded using an audio wave recorder in a normal room environment. The recorded speech signal is passed through the start- and end-point detection algorithm to detect the presence of speech and to remove silences and pauses from the signal. The resulting signal is then filtered to remove unwanted background noise. The filtered signal is then windowed with half-frame overlap. After windowing, the LPC and cepstral coefficients are calculated from the speech signal. The feature extractor uses a standard LPC cepstrum coder, which converts the incoming speech signal into the LPC cepstrum feature space. A self-organizing map (SOM) neural network converts each variable-length LPC trajectory of an isolated word into a fixed-length LPC trajectory, thereby producing a fixed-length feature vector to be fed into the recognizer. The neural network is designed with a multilayer perceptron approach and tested with 3, 4, and 5 hidden layers using the tanh sigmoid transfer function. Different neural network structures are compared for a better understanding of the problem and its possible solutions.
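The feature-extraction steps named in the abstract (windowing with half-frame overlap, LPC analysis, and conversion to cepstral coefficients) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the window type, frame length, LPC order, or estimation method, so the Hamming window, the autocorrelation method with the Levinson-Durbin recursion, and the function names `frame_signal`, `lpc`, and `lpc_to_cepstrum` are all assumptions made for the example.

```python
import numpy as np


def frame_signal(x, frame_len):
    """Split x into Hamming-windowed frames that overlap by half a frame.

    The Hamming window is an assumption; the abstract names no window type.
    """
    hop = frame_len // 2
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    return np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )


def lpc(frame, order):
    """LPC by the autocorrelation method and Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1 and A(z) = 1 + sum_k a[k] z^-k is the
    prediction-error filter, and err is the residual energy.
    Assumes the frame is not all zeros.
    """
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for step i.
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1 : 0 : -1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1/A(z).

    Uses the recursion c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    with a_n taken as 0 for n greater than the LPC order.
    """
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]
```

Under these assumptions, each row returned by `frame_signal` would be passed to `lpc` and then to `lpc_to_cepstrum`, giving one cepstral vector per frame. The resulting variable-length sequence of vectors corresponds to the "LPC trajectory" that the abstract says is normalized to a fixed length by the SOM.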
Keywords :
linear predictive coding; multilayer perceptrons; self-organising feature maps; speech coding; speech recognition; vector quantisation; ANN; Bangla speech recognition system; LPC; audio wave recorder; cepstral coefficients; end point detection; feature extractor; linear predictive coding; multilayer perceptron approach; noise removal; pattern recognition technique; self organizing map neural network; vector quantization; Artificial neural networks; Cepstral analysis; Cepstrum; Linear predictive coding; Neural networks; Pattern recognition; Signal processing; Speech processing; Speech recognition; Trajectory; LPC; Neural; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in Pattern Recognition, 2009. ICAPR '09. Seventh International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4244-3335-3
Type :
conf
DOI :
10.1109/ICAPR.2009.80
Filename :
4782767