Title :
Monaural speech segregation and oscillatory correlation
Author_Institution :
Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA
Abstract :
Summary form only given. Speech segregation from a monaural recording is a primary task of auditory grouping, and has proven to be very challenging. Theoretical and empirical investigations of brain functions point to the mechanism of oscillatory correlation as a plausible framework for perceptual grouping. In this framework, an assembly of synchronized oscillators represents a stream, and oscillator assemblies that desynchronize from one another represent different groups. We describe a multi-stage model for the monaural speech segregation task. The model starts with a simulated auditory periphery. A subsequent stage computes mid-level auditory representations, including correlograms and cross-channel correlations. Underlying auditory segmentation and grouping is a neural oscillator network that implements oscillatory correlation. The network encodes proximity in frequency and time, periodicity, and amplitude modulation (AM). Motivated by psychoacoustic observations, our system employs different mechanisms to handle resolved and unresolved harmonics. The model has been systematically evaluated, and it yields substantially better performance than previous systems.
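Illustrative sketch (not part of the original abstract): the mid-level representations named above, a correlogram and cross-channel correlations, could be computed roughly as follows. This assumes the auditory periphery has already produced a matrix of filterbank channel outputs. The function names, the normalization, and the use of a single analysis window are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def correlogram(channels, max_lag):
    """Normalized autocorrelation of each channel for lags 0..max_lag-1.

    channels: array of shape (n_channels, n_samples), assumed here to be
    the outputs of an auditory filterbank over one analysis window.
    """
    x = channels - channels.mean(axis=1, keepdims=True)
    n = x.shape[1]
    acf = np.empty((x.shape[0], max_lag))
    for lag in range(max_lag):
        # Sum of products between the signal and its copy shifted by `lag`.
        acf[:, lag] = np.sum(x[:, : n - lag] * x[:, lag:], axis=1)
    # Normalize by the zero-lag energy; silent channels are left at zero.
    energy = acf[:, :1]
    return acf / np.where(energy > 0, energy, 1.0)


def cross_channel_correlation(acf):
    """Pearson correlation between autocorrelations of adjacent channels."""
    z = acf - acf.mean(axis=1, keepdims=True)
    norm = np.linalg.norm(z, axis=1)
    z = z / np.where(norm > 0, norm, 1.0)[:, None]
    # One value per pair of neighbouring channels (n_channels - 1 values).
    return np.sum(z[:-1] * z[1:], axis=1)


if __name__ == "__main__":
    fs = 8000  # hypothetical sampling rate in Hz
    t = np.arange(800) / fs
    tone = np.sin(2 * np.pi * 100 * t)
    chans = np.stack([tone, tone, np.cos(2 * np.pi * 100 * t)])
    acf = correlogram(chans, max_lag=200)
    print(cross_channel_correlation(acf))
```

In this sketch, adjacent channels that respond to the same periodicity have similar autocorrelations and therefore a cross-channel correlation near 1, which is the kind of cue a grouping stage could use to link neighbouring channels.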
Keywords :
correlation methods; harmonics; neural nets; oscillations; speech processing; amplitude modulation; auditory grouping; correlogram; cross-channel correlations; empirical investigations; frequency and time proximity; midlevel auditory representations; monaural recording; monaural speech segregation; neural oscillator network; oscillatory correlation; periodicity; psychoacoustic observations; simulated auditory periphery; theoretical investigation; unresolved harmonics; Amplitude modulation; Assembly; Cognitive science; Computational modeling; Frequency synchronization; Information science; Oscillators; Psychoacoustic models; Psychology; Speech;
Conference_Title :
Neural Networks, 2003. Proceedings of the International Joint Conference on
Print_ISBN :
0-7803-7898-9
DOI :
10.1109/IJCNN.2003.1223933