Title :
A comparative study of signal representations and classification techniques for speech recognition
Author :
Leung, Hong C. ; Chigier, Benjamin ; Glass, James R.
Author_Institution :
NYNEX Science & Technology Inc., White Plains, NY, USA
Abstract :
The authors investigate the interactions of two important sets of techniques in speech recognition: signal representation and classification. In addition, in order to quantify the effect of the telephone network, experiments are performed on both wideband and telephone-quality speech. The spectral and cepstral signal processing techniques studied fall into a few major categories based on Fourier analyses, linear prediction, and auditory processing. The classification techniques examined are Gaussian, mixture Gaussians, and the multilayer perceptron (MLP). Results indicate that the MLP consistently produces lower error rates than the other two classifiers. When averaged across all three classifiers, the Bark auditory spectral coefficients (BASC) produce the lowest phonetic classification error rates. When evaluated in a stochastic segment framework using the MLP, BASC also produces the lowest word error rate.
Keywords :
Fourier analysis; acoustic signal processing; feedforward neural nets; filtering and prediction theory; speech recognition; telephone networks; Bark auditory spectral coefficients; Fourier analyses; MLP; auditory processing; cepstral signal processing; classification techniques; error rates; linear prediction; multilayer perceptron; phonetic classification; signal representation; stochastic segment; telephone network
Conference_Title :
1993 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-93)
Conference_Location :
Minneapolis, MN, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.1993.319402