Regional Information Center for Science and Technology - Monaural voiced speech segregation based on combined cues and energy distribution

DocumentCode :

2021065

Title :

Monaural voiced speech segregation based on combined cues and energy distribution

Author :

Zhao, Liheng ; Wang, Zengfu

Author_Institution :

Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China

fYear :

2010

fDate :

23-25 Nov. 2010

Firstpage :

Lastpage :

Abstract :

Monaural speech segregation is important for speech signal processing and has been studied extensively on the basis of auditory scene analysis principles. However, current segregation algorithms cannot achieve satisfactory performance in the high-frequency range. In this paper, we propose a system for monaural voiced speech segregation in which two novel ideas are investigated. First, combined cues (cross-channel correlation, temporal continuity, and onset/offset) are employed to generate segments in the high-frequency range. Second, the energy distribution of the mixed signal is used to indicate the reliability of the cues in the high-frequency range, and an alternative segmentation strategy is applied accordingly. Systematic evaluation and comparison show that the proposed system improves SNR gain.
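
The abstract reports its result as SNR gain but this record does not define the metric. The sketch below assumes one common convention: the SNR of the segregated output minus the SNR of the unprocessed mixture, both measured against the clean target speech. The function names `snr_db` and `snr_gain_db` and the toy signals are hypothetical and are not taken from the paper.

```python
import math

def snr_db(target, estimate):
    """SNR in dB of `estimate`, measured against the clean `target`.

    Assumes both sequences have the same length and that `target`
    is not silent (its energy is non-zero).
    """
    signal_energy = sum(t * t for t in target)
    noise_energy = sum((t - e) ** 2 for t, e in zip(target, estimate))
    if noise_energy == 0:
        return math.inf  # estimate matches the target exactly
    return 10.0 * math.log10(signal_energy / noise_energy)

def snr_gain_db(target, mixture, segregated):
    """Assumed definition: output SNR minus input SNR, in dB."""
    return snr_db(target, segregated) - snr_db(target, mixture)

# Toy signals (illustrative only, not data from the paper).
target = [1.0, -1.0, 1.0, -1.0]
mixture = [1.5, -0.5, 1.5, -0.5]      # target plus a 0.5 interference offset
segregated = [1.1, -0.9, 1.1, -0.9]   # interference reduced to 0.1

gain = snr_gain_db(target, mixture, segregated)
print(f"SNR gain: {gain:.2f} dB")
```

In this toy case the residual interference energy falls by a factor of 25, so the gain is positive; a gain of zero would mean the segregation stage left the mixture no cleaner than it was.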

Keywords :

speech processing; SNR gain; auditory scene analysis; cues distribution; energy distribution; monaural voiced speech segregation algorithm; speech signal processing; systematic evaluation; Correlation; Erbium; Signal to noise ratio; Speech; Speech processing; Wideband;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Audio Language and Image Processing (ICALIP), 2010 International Conference on

Conference_Location :

Shanghai

Print_ISBN :

978-1-4244-5856-1

Type :

conf

DOI :

10.1109/ICALIP.2010.5685014

Filename :

5685014

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2021065