Title :
Unified framework for single channel speech enhancement
Author :
Tashev, Ivan ; Lovitt, Andrew ; Acero, Alex
Abstract :
In this paper we describe a generic architecture for single-channel speech enhancement. We assume frequency-domain processing and suppression-based speech enhancement methods. The framework consists of a two-stage voice activity detector, a noise variance estimator, a suppression rule, and a modifier that accounts for the uncertain presence of the speech signal. The evaluation corpus is a synthetic mixture of clean speech (TIMIT database) and in-car recorded noises. Using the framework, multiple speech enhancement algorithms are tuned for maximum performance. We propose a formalized procedure for automated tuning of these algorithms. The optimization criterion is a weighted sum of the mean opinion score (PESQ-MOS), signal-to-noise ratio (SNR), log-spectral distance (LSD), and mean square error (MSE). The proposed framework provides a complete speech enhancement chain and can be used for evaluation and tuning of other suppression rules and voice activity detector algorithms.
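The chain named in the abstract (voice activity detector, noise variance estimator, suppression rule, uncertain-presence modifier) can be illustrated with a minimal per-frame sketch. This is not the authors' implementation: the abstract does not specify any of the stages, so the energy-based two-stage detector, the recursive noise averaging, the Wiener gain, and every name and default value (`enhance_frame`, `alpha`, `vad_threshold`, `gain_floor`) are assumptions chosen only to show how the four blocks fit together.

```python
import numpy as np

def enhance_frame(Y, noise_var, alpha=0.9, vad_threshold=2.0, gain_floor=0.1):
    """Process one frame of a complex spectrum Y (1-D array of frequency bins).

    Returns (enhanced spectrum, updated noise variance).
    All stages and default values are illustrative assumptions.
    """
    power = np.abs(Y) ** 2
    post_snr = power / np.maximum(noise_var, 1e-12)  # a posteriori SNR per bin

    # VAD stage 1 (assumed): frame-level decision from the mean a posteriori SNR.
    speech_frame = post_snr.mean() > vad_threshold

    # VAD stage 2 (assumed): soft per-bin speech presence probability.
    if speech_frame:
        p_speech = post_snr / (1.0 + post_snr)
    else:
        p_speech = np.zeros_like(power)

    # Noise variance estimator (assumed): recursive averaging on noise-only frames.
    if not speech_frame:
        noise_var = alpha * noise_var + (1.0 - alpha) * power
        post_snr = power / np.maximum(noise_var, 1e-12)

    # Suppression rule (assumed): Wiener gain with a maximum-likelihood
    # a priori SNR estimate, xi = max(gamma - 1, 0).
    prior_snr = np.maximum(post_snr - 1.0, 0.0)
    gain = prior_snr / (1.0 + prior_snr)

    # Uncertain-presence modifier (assumed): scale the gain by the presence
    # probability and keep it above a floor to limit artifacts.
    gain = np.maximum(gain * p_speech, gain_floor)

    return gain * Y, noise_var
```

In a framework like the one described, each of these blocks would be swappable, and parameters such as `alpha` or `gain_floor` would be the quantities adjusted by the automated tuning procedure.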
Keywords :
frequency-domain analysis; interference suppression; mean square error methods; optimisation; speech enhancement; TIMIT database; frequency domain processing; in-car recorded noises; log-spectral distance; mean opinion score; mean square error; noise variance estimator; optimization criterion; signal-to-noise-ratio; single channel speech enhancement; speech signal modifier; suppression based speech enhancement method; voice activity detector algorithm; Additive noise; Databases; Detectors; Filters; Frequency domain analysis; Mean square error methods; Noise reduction; Signal processing algorithms; Speech analysis; Speech enhancement;
Conference_Title :
Communications, Computers and Signal Processing, 2009. PacRim 2009. IEEE Pacific Rim Conference on
Conference_Location :
Victoria, BC
Print_ISBN :
978-1-4244-4560-8
Electronic_ISBN :
978-1-4244-4561-5
DOI :
10.1109/PACRIM.2009.5291253