DocumentCode :
542254
Title :
Monaural speech segregation based on pitch tracking and amplitude modulation
Author :
Hu, Guoning ; Wang, DeLiang
Author_Institution :
Biophysics Program, The Ohio State University, USA
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
Monaural speech segregation remains a computational challenge for auditory scene analysis (ASA). A major problem for existing computational auditory scene analysis (CASA) systems is their inability to deal with signals in the high-frequency range. Psychoacoustic evidence suggests that different perceptual mechanisms are involved to handle resolved and unresolved harmonics. We propose a system for speech segregation that deals with low-frequency and high-frequency signals differently. For low-frequency signals, our model generates segments based on temporal continuity and cross-channel correlation, and groups them according to periodicity. For high-frequency signals, the model generates segments based on common amplitude modulation (AM) in addition to temporal continuity, and groups them according to AM repetition rates. Underlying the grouping process is a pitch contour that is first estimated from segregated speech based on global pitch and then verified by psychoacoustic constraints. Our system is systematically evaluated, and it yields substantially better performance than previous CASA systems, especially in the high-frequency range.
Keywords :
Frequency modulation; Labeling; Nickel; Psychoacoustic models; Signal to noise ratio; Speech; Speech processing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743777
Filename :
5743777