DocumentCode :
1749214
Title :
An extended model for speech segregation
Author :
Hu, Guoning ; Wang, DeLiang
Author_Institution :
Biophys. Program, Ohio State Univ., Columbus, OH, USA
Volume :
2
fYear :
2001
fDate :
2001
Firstpage :
1089
Abstract :
Speech segregation is an important task of auditory scene analysis (ASA), in which the speech of a certain speaker is separated from other interfering signals. Wang and Brown (1999) proposed a multistage neural model for speech segregation, the core of which is a two-layer oscillator network. We extend their model by adding further processes based on psychoacoustic evidence to improve the performance. These processes include estimation of the pitch of target speech and refined generation of a target speech stream with the estimated pitch. Our model is systematically evaluated and compared with the Wang-Brown model, and it yields significantly better performance.
Keywords :
neural nets; physiological models; speech processing; speech synthesis; auditory scene analysis; extended model; multistage neural model; pitch estimation; psychoacoustic evidence; speech segregation; two-layer oscillator network; Automatic speech recognition; Image analysis; Oscillators; Psychoacoustic models; Psychology; Speech analysis; Speech enhancement; Speech processing; Speech synthesis; Wideband;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks, 2001. Proceedings. IJCNN '01. International Joint Conference on
Conference_Location :
Washington, DC
ISSN :
1098-7576
Print_ISBN :
0-7803-7044-9
Type :
conf
DOI :
10.1109/IJCNN.2001.939512
Filename :
939512