DocumentCode :
3415096
Title :
Prediction of speech quality using radial basis functions neural networks
Author :
Meky, Mohamd M. ; Saadawi, Tarek N.
Author_Institution :
Dept. of Electr. Eng., City Coll. of New York, NY, USA
fYear :
1997
fDate :
1-3 Jul 1997
Firstpage :
174
Lastpage :
178
Abstract :
This paper proposes a new perceptually-based objective technique that uses radial basis functions neural networks, instead of regression algorithms, to estimate the nonlinear mapping function that best represents the relationship between the input (perceptual parameters) and output (speech quality) variables in a database. In the proposed technique, the perceptual parameters are obtained by: (1) emulating several known features of the perceptual processing of speech sounds by the human ear (including critical-band masking, equal loudness, and the intensity-loudness power law) to map the speech power spectrum into the auditory power spectrum (Bark domain); (2) deriving the perceptual LPC coefficients from the auditory spectrum, which are used to calculate, for each frame, the cepstrum distance between the input and the output coded speech signals; and (3) using the radial basis functions neural network to map the perceptual cepstrum distance per frame into the corresponding estimated speech quality. After extensive experimentation and validation, the results indicate that the proposed technique is effective for estimating coded speech quality.
Keywords :
feedforward neural nets; hearing; linear predictive coding; spectral analysis; speech coding; speech intelligibility; speech processing; auditory power spectrum; cepstrum distance; coded speech signals; critical-band masking; database; equal loudness; experiment; human ear; intensity-loudness power law; nonlinear mapping function; output variables; perceptual LPC coefficients; perceptual cepstrum distance; perceptual parameters; perceptual processing; perceptually-based objective technique; radial basis functions neural networks; speech power spectrum; speech quality prediction; speech sounds; Cepstrum; Ear; Humans; Linear predictive coding; Neural networks; Radial basis function networks; Spatial databases; Speech analysis; Speech coding; Speech processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computers and Communications, 1997. Proceedings., Second IEEE Symposium on
Conference_Location :
Alexandria
Print_ISBN :
0-8186-7852-6
Type :
conf
DOI :
10.1109/ISCC.1997.615991
Filename :
615991