Title :
An extended model for speech segregation
Author :
Hu, Guoning ; Wang, DeLiang
Author_Institution :
Biophys. Program, Ohio State Univ., Columbus, OH, USA
Abstract :
Speech segregation is an important task of auditory scene analysis (ASA), in which the speech of a certain speaker is separated from other interfering signals. Wang and Brown (1999) proposed a multistage neural model for speech segregation, the core of which is a two-layer oscillator network. We extend their model by adding further processes, based on psychoacoustic evidence, to improve performance. These processes include estimation of the pitch of the target speech and refined generation of a target speech stream using the estimated pitch. Our model is systematically evaluated and compared with the Wang-Brown model, and it yields significantly better performance.
Keywords :
neural nets; physiological models; speech processing; speech synthesis; auditory scene analysis; extended model; multistage neural model; pitch estimation; psychoacoustic evidence; speech segregation; two-layer oscillator network; Automatic speech recognition; Image analysis; Oscillators; Psychoacoustic models; Psychology; Speech analysis; Speech enhancement; Speech processing; Speech synthesis; Wideband;
Conference_Title :
Neural Networks, 2001. Proceedings. IJCNN '01. International Joint Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-7803-7044-9
DOI :
10.1109/IJCNN.2001.939512