DocumentCode :
3438404
Title :
A temporal saliency map for modeling auditory attention
Author :
Kaya, Emine Merve ; Elhilali, Mounya
Author_Institution :
Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA
fYear :
2012
fDate :
21-23 March 2012
Firstpage :
1
Lastpage :
6
Abstract :
The auditory system is flooded with information throughout our daily lives. Rather than processing all of this information, we selectively shift our attention to various auditory events - either events of interest (top-down attention) or events that capture our attention exogenously (bottom-up attention). In this work, we are concerned with aspects of human attention that are bottom-up and stimulus-driven. Saliency of an auditory event is measured by how much the event differs from the surrounding sounds that precede it in time. To calculate this, we propose a novel auditory saliency map that is defined only over time. The proposed model is contrasted against previously published auditory saliency maps, which treat the two-dimensional auditory time-frequency spectrogram as an image that can be analyzed using visual saliency models. Instead, our proposed model capitalizes on the rich high-dimensional feature space that defines auditory events, where each acoustic dimension is processed across multiple scales. These normalized feature maps are then combined over time into a single temporal saliency map. The peaks of the temporal saliency map indicate the locations of the salient events in the auditory scene. We validate the accuracy of the proposed model in simulated test scenarios of simple and complex sound clips. By exploiting the unique aspects of auditory processing that cannot be readily captured by visual processes, we are able to outperform other auditory saliency models, all while highlighting the commonalities and differences between the two modalities in processing salient events in everyday scenes.
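The abstract outlines a pipeline (acoustic features at multiple temporal scales, normalized per-feature maps, summed into one saliency curve whose peaks mark salient events). Below is a minimal, hedged sketch of that idea; the specific features, scale choices, contrast measure, and normalization are illustrative assumptions and are not taken from the paper itself.

```python
import numpy as np

def temporal_saliency(features, scales=(5, 20, 80)):
    """Sketch of a purely temporal saliency map.

    features : 2-D array (n_features, n_frames) of acoustic features
               over time (e.g. loudness, pitch, timbre).
    scales   : smoothing window lengths in frames (hypothetical values).

    Each feature is contrasted against the sounds that precede it in
    time at several scales; per-scale maps are normalized to [0, 1]
    and averaged into a single saliency curve over time.
    """
    n_feat, n_frames = features.shape
    saliency = np.zeros(n_frames)
    for f in range(n_feat):
        x = features[f]
        for w in scales:
            # causal moving average models the "surrounding sounds
            # that precede" the current frame
            kernel = np.ones(w) / w
            past = np.convolve(x, kernel, mode="full")[:n_frames]
            contrast = np.abs(x - past)      # deviation from recent past
            rng = contrast.max() - contrast.min()
            if rng > 0:                       # normalize this feature map
                contrast = (contrast - contrast.min()) / rng
            saliency += contrast
    return saliency / (n_feat * len(scales))

def find_salient_peaks(saliency, thresh=0.5):
    """Return frame indices that are local maxima of the saliency
    curve above a threshold (peaks mark salient events)."""
    peaks = []
    for t in range(1, len(saliency) - 1):
        if (saliency[t] >= thresh
                and saliency[t] > saliency[t - 1]
                and saliency[t] >= saliency[t + 1]):
            peaks.append(t)
    return peaks
```

For example, a single feature that steps from 0 to 1 at frame 200 yields a saliency curve peaking at frame 200, where the signal departs most from its preceding context.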
Keywords :
auditory evoked potentials; brain models; feature extraction; hearing; medical computing; time-frequency analysis; acoustic dimension; auditory attention modeling; auditory events; auditory processing; bottom-up stimulus-driven attention; complex sound clips; high-dimensional feature space; image processing; simple sound clips; temporal saliency map; top-down attention; two-dimensional auditory time-frequency spectrogram; visual processes; visual saliency models; Bandwidth; Computational modeling; Feature extraction; Humans; Spectrogram; Timbre; Time frequency analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2012 46th Annual Conference on Information Sciences and Systems (CISS)
Conference_Location :
Princeton, NJ
Print_ISBN :
978-1-4673-3139-5
Electronic_ISBN :
978-1-4673-3138-8
Type :
conf
DOI :
10.1109/CISS.2012.6310945
Filename :
6310945