Exciting estimated clean spectra for speech resynthesis

Author

Sreyas Srimath Tirumala; Michael I Mandel

Author_Institution

The Ohio State University, Computer Science &

fYear

2015

Firstpage

Lastpage

Abstract

Spectral masking techniques are prevalent for noise suppression, but they damage speech in regions of the spectrum where both noise and speech are present. This paper instead uses a recently introduced analysis-by-synthesis technique to estimate the spectral envelope of the speech at all frequencies, and adds to it a model of the speech excitation, which is needed to fully resynthesize a clean speech signal. Such a resynthesis should have less noise and higher quality than mask-based approaches. We compare several different excitation signals on the Aurora4 corpus, including those derived from the high-quefrency components of the noisy mixture and from the combination of a noise-robust pitch tracker and a voiced/unvoiced classifier. Preliminary subjective evaluations suggest that speech synthesized using our approach has higher voice quality and better noise suppression than spectral masking.
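To illustrate the envelope/excitation split the abstract refers to, the following is a minimal sketch, not the authors' implementation. It assumes a plain real-cepstrum lifter applied to a single Hann-windowed frame: low quefrencies are treated as the spectral envelope and high quefrencies as the excitation. The function names, the `cutoff` value, and the use of the noisy phase are all assumptions made for illustration; the paper's analysis-by-synthesis envelope estimator, pitch tracker, and voiced/unvoiced classifier are not reproduced here.

```python
import numpy as np


def split_log_spectrum(frame, cutoff=30, eps=1e-10):
    """Split one frame's log-magnitude spectrum by quefrency.

    Returns (envelope, excitation, phase), where envelope and excitation
    are log-magnitude spectra that sum to the frame's log-magnitude.
    `cutoff` (in cepstral bins) is an illustrative choice, not a value
    taken from the paper.
    """
    n = len(frame)
    spectrum = np.fft.rfft(frame * np.hanning(n))
    log_mag = np.log(np.abs(spectrum) + eps)

    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    cepstrum = np.fft.irfft(log_mag, n=n)

    # Symmetric low-quefrency lifter: bins 0..cutoff-1 and their mirror images.
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0
    lifter[n - cutoff + 1:] = 1.0

    envelope = np.fft.rfft(cepstrum * lifter).real            # low quefrency
    excitation = np.fft.rfft(cepstrum * (1.0 - lifter)).real  # high quefrency
    return envelope, excitation, np.angle(spectrum)


def resynthesize_frame(envelope, excitation, phase, n):
    """Combine a log-envelope and a log-excitation into a time-domain frame.

    In a full system `envelope` would be an estimate of the clean speech
    envelope; here any log-magnitude envelope of matching length works.
    """
    magnitude = np.exp(envelope + excitation)
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n)
```

In this sketch, replacing `envelope` with an estimate of the clean envelope while keeping the high-quefrency `excitation` from the noisy frame corresponds to one of the excitation choices the abstract mentions. Full resynthesis would also need framing and overlap-add, which are omitted.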

Keywords

"Speech","Noise measurement","Mel frequency cepstral coefficient","Noise robustness","Deconvolution","Speech recognition","Speech processing"

Publisher

IEEE

Conference_Title

Applications of Signal Processing to Audio and Acoustics (WASPAA), 2015 IEEE Workshop on

Type

conf

DOI

10.1109/WASPAA.2015.7336907

Filename

7336907

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3697424