DocumentCode
1503241
Title
Separation of speech from interfering sounds based on oscillatory correlation
Author
Wang, DeLiang L. ; Brown, Guy J.
Author_Institution
Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA
Volume
10
Issue
3
fYear
1999
fDate
1 May 1999
Firstpage
684
Lastpage
697
Abstract
A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized relaxation oscillators, each of which corresponds to an auditory feature, and different streams are represented by desynchronized oscillator populations. Lateral connections between oscillators encode harmonicity and proximity in frequency and time. The oscillator network is preceded by a model of the auditory periphery and a stage in which mid-level auditory representations are formed. The model has been systematically evaluated on a corpus of voiced speech mixed with interfering sounds, and it produces an improvement in signal-to-noise ratio for every mixture. Issues of biological plausibility and real-time implementation are also discussed.
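Code_Sketch
A minimal sketch of the oscillatory-correlation building block described in the abstract: two relaxation oscillators of the Terman-Wang type, joined by excitatory lateral coupling, fall into synchrony, as oscillators within a single stream do in the model. The parameter values, coupling form, and simulation settings below are illustrative assumptions for demonstration, not the settings used in the paper.

import numpy as np

# Terman-Wang relaxation oscillator parameters (illustrative assumptions):
# EPS sets the slow-variable time scale; GAMMA and BETA shape the recovery.
EPS, GAMMA, BETA = 0.02, 6.0, 0.1
I_EXT = 0.8   # external stimulation; positive input enables oscillation
W = 0.3       # lateral excitatory coupling weight (assumed value)

def step(x, y, s, dt=0.05):
    """One Euler step of a Terman-Wang oscillator.
    x: fast (excitatory) variable, y: slow (recovery) variable,
    s: summed lateral input from coupled oscillators."""
    dx = 3.0 * x - x**3 + 2.0 - y + I_EXT + s
    dy = EPS * (GAMMA * (1.0 + np.tanh(x / BETA)) - y)
    return x + dt * dx, y + dt * dy

# Two oscillators started from different phases.
x = np.array([-1.5, 1.0])
y = np.array([2.0, 4.0])

for t in range(20000):
    # Each oscillator receives weighted input whenever the other is in its
    # active phase (Heaviside coupling). In the paper's network the coupling
    # is additionally gated by harmonicity and time-frequency proximity.
    s = W * np.heaviside(x[::-1], 0.0)
    x, y = step(x, y, s)

# After transients, the coupled pair fires in near synchrony: the fast
# variables end up approximately equal.
print("final fast variables:", x)

Removing the coupling (W = 0) leaves the two oscillators drifting at independent phases, which is how the model represents features belonging to different streams.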
Keywords
correlation methods; harmonic analysis; neural nets; speech coding; speech recognition; auditory scene analysis; encoding; harmonicity; multistage neural model; oscillatory correlation; real-time system; speech segregation; speech signal separation; stream segregation; two-layer oscillator network; Automatic speech recognition; Biological system modeling; Cognitive science; Ear; Frequency; Image analysis; Inference algorithms; Oscillators; Speech analysis; Speech recognition
fLanguage
English
Journal_Title
IEEE Transactions on Neural Networks
Publisher
IEEE
ISSN
1045-9227
Type
jour
DOI
10.1109/72.761727
Filename
761727