Title :
Speech Bandwidth Enhancement using State Space Speech Dynamics
Author :
Yao, Sheng ; Chan, Cheung-Fat
Author_Institution :
Dept. of Electron. Eng., City Univ. of Hong Kong
Abstract :
Extending narrowband speech (0-4 kHz) to wideband speech (0-8 kHz) has applications in telephone systems and in speech recognition systems where wideband training speech data may not be available. Several methods have been proposed to recover the missing high-band information (4-8 kHz) from narrowband speech. Memoryless systems are likely to produce strong hissing artifacts because the mutual information between the low-band (0-4 kHz) and high-band (4-8 kHz) spectra is quite low. In general, bandwidth extension cannot recover the original high-band information, but good approximations with less over-estimation of the high-band energy, the usual cause of the hissing artifact, can be obtained by considering neighboring speech frames. In this paper, we propose a new bandwidth extension system with memory that uses a state-space model to capture long-term speech dynamics. The model parameters can be trained in the maximum likelihood (ML) sense, and the enhancement is obtained via wideband state vector estimation and Kalman filtering. In terms of spectral distortion, the performance is shown to be much better than that of memoryless systems and comparable with that of an earlier continuous density hidden Markov model (CDHMM) system with memory. The new state-space method is inherently sequential and offers lower processing delay and robustness against block detection errors.
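As a rough illustration of the Kalman filtering step named in the abstract, the sketch below runs the standard predict/update recursion on a scalar linear-Gaussian state-space model. It is not the authors' system: the paper estimates a wideband state vector, whereas this example assumes a one-dimensional state, and the function name `kalman_filter_1d` and the parameters `a`, `c`, `q`, `r`, `x0`, `p0` are hypothetical, chosen only for illustration.

```python
def kalman_filter_1d(observations, a, c, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter (illustrative assumption, not the paper's model).

    State:       x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)
    Observation: y_t = c * x_t     + v_t,  v_t ~ N(0, r)

    Returns the list of filtered state means, one per observation.
    """
    x, p = x0, p0  # current state mean and variance
    estimates = []
    for y in observations:
        # Predict: propagate mean and variance through the state equation.
        x_pred = a * x
        p_pred = a * p * a + q
        # Update: correct the prediction with the new observation.
        gain = p_pred * c / (c * p_pred * c + r)
        x = x_pred + gain * (y - c * x_pred)
        p = (1.0 - gain * c) * p_pred
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Constant observations: the estimate moves from x0 toward the observed level.
    est = kalman_filter_1d([1.0] * 20, a=1.0, c=1.0, q=0.01, r=0.1)
    print(est[0], est[-1])
```

Because each estimate depends only on the previous estimate and the current observation, the recursion processes frames one at a time, which is the sense in which a state-space method is sequential.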
Keywords :
Kalman filters; hidden Markov models; maximum likelihood estimation; speech enhancement; speech recognition; state-space methods; 0 to 8 kHz; Kalman filtering; ML; bandwidth extension system; block detection errors; continuous density hidden Markov model memory system; hissing artifact; long-term speech dynamics; maximum likelihood; narrowband speech; spectral distortion; speech bandwidth enhancement; speech recognition systems; state space speech dynamics; telephone systems; wideband speech; wideband state vector estimation; Bandwidth; Hidden Markov models; Maximum likelihood estimation; Memoryless systems; Narrowband; Speech enhancement; Speech recognition; State-space methods; Telephony; Wideband;
Conference_Title :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1660064