Title :
Modeling Human Auditory Perception for Noise-Robust Speech Recognition
Author_Institution :
Professor, Ph. D., Korea Advanced Institute of Science and Technology (KAIST), KOREA, Editor-in-Chief, Neural Information Processing - Letters & Reviews, Professor, Department of BioSystems, Professor, Division of Electrical Engineering, Department of Ele
Abstract :
Several bio-inspired models of human auditory perception are reported for robust speech recognition in real-world noisy environments. The mathematical models developed for the human auditory pathway are integrated into a speech recognition system with three components: (1) a nonlinear feature extraction model from the cochlea to the auditory cortex, (2) a binaural processing model at the superior olivary complex, and (3) a top-down attention model from the higher brain to the cochlea. Unsupervised Independent Component Analysis (ICA) shows that some auditory feature extraction and binaural processing mechanisms follow information theory with sparse representation. The ICA-based features resemble the frequency-limited features extracted by the cochlea, as well as the more complex time-frequency features of the inferior colliculus and auditory cortex. The top-down attention model shows how pre-acquired knowledge in the brain filters out irrelevant features or fills in missing features in the sensory data. Top-down attention and bottom-up binaural processing are combined into a single system for highly noisy cases. The auditory model is computationally demanding, and several VLSI implementations have been developed for real-time applications. Experimental results demonstrate much better recognition performance in real-world noisy environments.
Keywords :
Brain modeling; Feature extraction; Frequency; Humans; Independent component analysis; Information theory; Mathematical model; Noise robustness; Speech recognition; Working environment noise;
Conference_Title :
Neural Networks and Brain, 2005. ICNN&B '05. International Conference on
Print_ISBN :
0-7803-9422-4
DOI :
10.1109/ICNNB.2005.1614702