• DocumentCode
    788181
  • Title
    A Multiresolution Model of Auditory Excitation Pattern and Its Application to Objective Evaluation of Perceived Speech Quality
  • Author
    Karmakar, Abhijit ; Kumar, Arun ; Patney, R.K.
  • Author_Institution
    Central Electronics Engineering Research Institute, Pilani
  • Volume
    14
  • Issue
    6
  • fYear
    2006
  • Firstpage
    1912
  • Lastpage
    1923
  • Abstract
    This paper proposes a multiresolution model of the auditory excitation pattern and applies it to the objective evaluation of subjective wideband speech quality. The model uses the wavelet packet transform for time-frequency decomposition of the input signal. The wavelet packet tree is selected by an optimality criterion that minimizes a cost function based on the critical band structure. The models of the different auditory phenomena are reformulated for the multiresolution framework, which includes proposing duration-dependent outer and middle ear weighting, multiresolution spectral spreading, and multiresolution temporal smearing. As an application, the excitation pattern is used to define an objective measure of the auditory distortion of a distorted speech signal relative to the undistorted one. The performance of this objective measure in predicting the subjective mean opinion score (MOS) is evaluated on a database of wideband speech signals degraded by various kinds of NOISEX-92 noise and is compared with the fast Fourier transform (FFT)-based ITU-T PESQ P.862.2 algorithm. The proposed measure is found to achieve a correlation between subjective MOS and objective MOS comparable to that of PESQ P.862.2, with a trend suggesting better correlation for nonstationary degradations than for stationary ones. Further refinement of the measure for distortion types other than additive noise is anticipated.
  • Keywords
    fast Fourier transforms; hearing; speech processing; time-frequency analysis; trees (mathematics); wavelet transforms; additive noise; auditory excitation pattern; fast Fourier transform; mean opinion score; multiresolution model; multiresolution temporal smearing; perceived speech quality; subjective wideband speech quality; time-frequency decomposition; wavelet packet transform; wavelet packet tree; Cost function; Degradation; Distortion measurement; Noise measurement; Signal resolution; Speech analysis; Time frequency analysis; Wavelet packets; Wavelet transforms; Wideband; Multiresolution auditory model; NOISEX-92; perceptual evaluation of speech quality (PESQ) P.862.2; subjective speech quality estimation; wavelet packet; wideband speech
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2006.883257
  • Filename
    1709881