DocumentCode :
788181
Title :
A Multiresolution Model of Auditory Excitation Pattern and Its Application to Objective Evaluation of Perceived Speech Quality
Author :
Karmakar, Abhijit ; Kumar, Arun ; Patney, R.K.
Author_Institution :
Central Electronics Engineering Research Institute, Pilani
Volume :
14
Issue :
6
fYear :
2006
Firstpage :
1912
Lastpage :
1923
Abstract :
This paper proposes a multiresolution model of the auditory excitation pattern and applies it to the objective evaluation of subjective wideband speech quality. The model uses the wavelet packet transform for time-frequency decomposition of the input signal. The wavelet packet tree is selected by an optimality criterion that minimizes a cost function based on the critical band structure. The models of the different auditory phenomena are reformulated for the multiresolution framework; this includes duration-dependent outer and middle ear weighting, multiresolution spectral spreading, and multiresolution temporal smearing. As an application, the excitation pattern is used to define an objective measure of the auditory distortion of a distorted speech signal relative to the undistorted one. The performance of this measure in predicting the subjective mean opinion score (MOS) is evaluated on a database of wideband speech signals degraded by various kinds of NOISEX-92 noise, and is compared with the fast Fourier transform (FFT)-based ITU-T PESQ P.862.2 algorithm. The proposed measure achieves a correlation between subjective and objective MOS comparable to that of PESQ P.862.2, with a trend suggesting better correlation for nonstationary degradations than for stationary ones. Further refinement of the measure for distortion types other than additive noise is anticipated.
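The idea of matching a wavelet packet tree to the critical band structure can be illustrated with a minimal sketch. The abstract does not state the paper's cost function, so the rule below is only an assumption chosen for illustration: a band is halved whenever it is wider than the critical bandwidth at its centre frequency, using the Zwicker-Terhardt approximation. The function names, the 0 to 8000 Hz range (wideband speech sampled at 16 kHz), and the depth limit of 6 are all hypothetical.

```python
def critical_bandwidth_hz(f_hz):
    """Zwicker-Terhardt approximation of the critical bandwidth (Hz) at f_hz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


def select_bands(lo_hz, hi_hz, depth=0, max_depth=6):
    """Return the leaf bands of a dyadic (wavelet-packet-like) tree.

    Assumed splitting rule, not the paper's criterion: halve [lo_hz, hi_hz)
    while it is wider than the critical bandwidth at its centre and the
    depth limit has not been reached.
    """
    width = hi_hz - lo_hz
    centre = 0.5 * (lo_hz + hi_hz)
    if depth >= max_depth or width <= critical_bandwidth_hz(centre):
        return [(lo_hz, hi_hz)]
    return (select_bands(lo_hz, centre, depth + 1, max_depth)
            + select_bands(centre, hi_hz, depth + 1, max_depth))


# Leaf bands for an assumed 0-8 kHz analysis range.
bands = select_bands(0.0, 8000.0)
for lo, hi in bands:
    print(f"{lo:7.1f} - {hi:7.1f} Hz  (width {hi - lo:6.1f} Hz)")
```

Under these assumptions the tree is deep at low frequencies, where critical bands are narrow, and shallow at high frequencies, which is the qualitative behaviour a critical-band-based tree selection aims for. A real implementation would apply the resulting tree with a wavelet packet filter bank rather than ideal frequency halving.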
Keywords :
fast Fourier transforms; hearing; speech processing; time-frequency analysis; trees (mathematics); wavelet transforms; additive noise; auditory excitation pattern; fast Fourier transform; mean opinion score; multiresolution model; multiresolution temporal smearing; perceived speech quality; subjective wideband speech quality; time-frequency decomposition; wavelet packet transform; wavelet packet tree; Cost function; Degradation; Distortion measurement; Noise measurement; Signal resolution; Speech analysis; Time frequency analysis; Wavelet packets; Wavelet transforms; Wideband; Multiresolution auditory model; NOISEX-92; perceptual evaluation of speech quality (PESQ) P.862.2; subjective speech quality estimation; wavelet packet; wideband speech
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2006.883257
Filename :
1709881