• DocumentCode
    1888259
  • Title
    Speech enhancement based on MAP-log spectral magnitude estimation using the gamma prior of the speech power
  • Author
    Dat, Tran Huy ; Takeda, Kenji ; Itakura, F.
  • Author_Institution
    Nagoya Univ., Japan
  • fYear
    2005
  • fDate
    18-20 May 2005
  • Firstpage
    43
  • Abstract
    Summary form only given. In this work we propose and develop a speech enhancement system based on MAP estimation in the log-spectral magnitude domain, using gamma prior modeling of the speech power. Gamma modeling in the power domain is shown to be a natural extension of the conventional Gaussian model of the complex speech spectrum and is therefore able to fit the prior distribution better. Furthermore, we propose a MAP estimation method for an arbitrary nonlinear functional domain of the speech spectral magnitude. Noise suppression filtering systems are implemented for MAP estimation in the spectral magnitude, power, and log-spectral magnitude domains. The gamma modeling yields closed-form solutions of the estimation and therefore provides high flexibility with low computational cost in the implemented systems. The systems are evaluated on the standard AURORA2 data using segmental SNR and speech recognition measurements. The experimental results show the advantage of the gamma model of speech power over the conventional Gaussian model. Among the gamma-model-based methods, MAP estimation on the log-spectral magnitude performs best, with an improvement of approximately 3 dB in segmental SNR and an 18 percent relative WER improvement compared to the conventional methods.
  • Keywords
    digital filters; error statistics; gamma distribution; maximum likelihood estimation; nonlinear functions; signal denoising; spectral analysis; speech enhancement; speech recognition; AURORA2 data; MAP estimation; WER; complex speech spectrum; gamma prior; log-spectral magnitude; noise suppression filtering; nonlinear function; segmental SNR; speech power
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Nonlinear Signal and Image Processing, 2005. NSIP 2005. Abstracts. IEEE-Eurasip
  • Conference_Location
    Sapporo
  • Print_ISBN
    0-7803-9064-4
  • Type
    conf
  • DOI
    10.1109/NSIP.2005.1502302
  • Filename
    1502302