DocumentCode :
1749619
Title :
Integration of fixed and multiple resolution analysis in a speech recognition system
Author :
Gemello, Roberto ; Albesano, Dario ; Moisa, Loreta ; de Mori, Renato
Author_Institution :
Centro Studi e Lab. Telecommun. SpA, Torino, Italy
Volume :
1
fYear :
2001
fDate :
2001
Firstpage :
121
Abstract :
This paper compares the performance of an operational automatic speech recognition system when Mel frequency-scaled cepstral coefficients (MFCCs), J-Rasta perceptual linear prediction coefficients (J-Rasta PLP), and energies from a multi-resolution analysis (MRA) tree of filters are used as input features. The system is a hybrid in which a neural network (NN) provides observation probabilities for a network of hidden Markov models (HMMs). The paper also compares the performance of the system when various combinations of these features are used, showing a word error rate (WER) reduction of 16% compared with using J-Rasta PLP coefficients alone when those coefficients are combined with the energies computed at the output of the leaves of an MRA filter tree. Such a combination is practically feasible thanks to the NN architecture used in the system. Recognition is performed without any language model on a very large test set that includes many speakers uttering proper names from different locations on the Italian public telephone network.
Keywords :
cepstral analysis; feedforward neural nets; filtering theory; hidden Markov models; linear predictive coding; probability; speech recognition; wavelet transforms; Italian public telephone network; J-Rasta perceptual linear prediction coefficients; MFCCs; Mel frequency-scaled cepstral coefficients; filter tree; fixed resolution analysis; hidden Markov models; multiple resolution analysis; neural network; observation probabilities; speech recognition system; word error rate; Automatic speech recognition; Cepstral analysis; Computer architecture; Hidden Markov models; Mel frequency cepstral coefficient; Neural networks; Nonlinear filters; Performance analysis; Performance evaluation; Speech analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
ISSN :
1520-6149
Print_ISBN :
0-7803-7041-4
Type :
conf
DOI :
10.1109/ICASSP.2001.940782
Filename :
940782