DocumentCode :
3143722
Title :
Introduction of speech log-spectral priors into dereverberation based on Itakura-Saito distance minimization
Author :
Iwata, Yasuaki ; Nakatani, Tomohiro
Author_Institution :
Grad. Sch. of Inf. Sci., Nagoya Univ., Nagoya, Japan
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
245
Lastpage :
248
Abstract :
It has recently been shown that multi-channel linear prediction can effectively achieve blind speech dereverberation based on maximum-likelihood (ML) estimation. This approach can estimate and cancel unknown reverberation processes from only a few seconds of observation. However, one problem with this approach is that speech distortion may increase if the dereverberation based on Itakura-Saito (IS) distance minimization is iterated more than once to further reduce the reverberation. To overcome this problem, we introduce speech log-spectral priors into this approach and reformulate it based on maximum a posteriori (MAP) estimation. Two types of priors are introduced: a Gaussian mixture model (GMM) of speech log spectra and a GMM of speech mel-frequency cepstral coefficients. In the formulation, we also propose a new versatile technique to integrate such log-spectral priors with the IS distance minimization in a computationally efficient manner. Preliminary experiments show the effectiveness of the proposed approach.
Keywords :
Gaussian processes; maximum likelihood estimation; minimisation; prediction theory; reverberation; speech processing; GMM; Gaussian mixture model; IS distance minimization; Itakura-Saito distance minimization; MAP estimation; ML estimation; blind speech dereverberation; maximum a posteriori estimation; maximum-likelihood estimation; multichannel linear prediction; reverberation processing; speech distortion; speech log spectra; speech mel-frequency cepstral coefficient; Estimation; Mel frequency cepstral coefficient; Minimization; Optimization; Speech; Speech enhancement; Vectors; Dereverberation; Gaussian mixture model; Itakura-Saito distance; Maximum a posteriori estimation; probabilistic speech model;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6287863
Filename :
6287863