DocumentCode :
2021065
Title :
Monaural voiced speech segregation based on combined cues and energy distribution
Author :
Zhao, Liheng ; Wang, Zengfu
Author_Institution :
Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
fYear :
2010
fDate :
23-25 Nov. 2010
Firstpage :
57
Lastpage :
63
Abstract :
Monaural speech segregation is important for speech signal processing and has been studied extensively on the basis of auditory scene analysis principles. However, current segregation algorithms cannot achieve satisfactory performance in the high-frequency range. In this paper, we propose a system for monaural voiced speech segregation in which two novel ideas are investigated. First, combined cues (including cross-channel correlation, temporal continuity, and onset/offset) are employed to generate segments in the high-frequency range. Second, the energy distribution of the mixed signal is used to indicate the reliability of the cues in the high-frequency range, and an alternative segmentation strategy is applied accordingly. Systematic evaluation and comparison show that the proposed system improves SNR gain.
Keywords :
speech processing; SNR gain; auditory scene analysis; cues distribution; energy distribution; monaural voiced speech segregation algorithm; speech signal processing; systematic evaluation; Correlation; Erbium; Signal to noise ratio; Speech; Speech processing; Wideband
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Audio Language and Image Processing (ICALIP), 2010 International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-5856-1
Type :
conf
DOI :
10.1109/ICALIP.2010.5685014
Filename :
5685014