DocumentCode :
2208164
Title :
Scene variability and perception constancy in the visual system: a model of pre-processing before data analysis and learning
Author :
Hérault, J. ; Guyader, N. ; Guérin-Dugué, A.
Author_Institution :
GIPSA-Lab., Inst. Nat. Polytech. of Grenoble, Grenoble, France
fYear :
2009
fDate :
1-4 Sept. 2009
Firstpage :
1
Lastpage :
12
Abstract :
Hell in data analysis is paved (at least) with variability and noise. Is there some lost garden of Eden? Is there some way to approach it? In this paper, we deal with human visual perception and show how our visual system manages to process visual information so efficiently that it is able to categorize images or scenes within 100-150 ms, independently of variability and noise. In fact, before the high-level recognition task, the visual system unfolds a series of pre-processing stages to reduce the variability that disturbs an image: (1) In the retina: a first adaptive process for global and local illuminant intensity and color, and a second for local contrasts, preprocess the visual signal to bring an equal quantity of information over the whole image. The retinal spatio-temporal filter whitens the image spectra so that all spatial frequencies are equally represented. (2) In the primary visual cortex: a bank of cortical-like filters decomposes the power frequency spectrum of the visual signal; this is equivalent to estimating the local power frequency spectrum, which is relatively insensitive to image translations. Moreover, such a decomposition offers a log-polar representation of the power spectrum, which can be useful for resolving zoom and rotation variability and also for estimating local perspective. (3) In the cortical area V4: a further Fourier transform of the log-polar spectrum provides insensitivity to zooms and rotations, as well as to perspective transformations. In this research, an image is represented by means of a high-dimensional vector: first the image is preprocessed by the retina, then the power spectrum of the preprocessed image is decomposed by a bank of filters, and the energy output of each filter is considered. Taking advantage of all the processing stages, data variability can be expected to be optimally reduced for comparison purposes and/or for categorization tasks.
In the last part of this paper, we give an example of image categorization by means of CCA, a self-organizing neural network which, after noise reduction, reduces dimension and unfolds the manifold in which the data are embedded. It is also shown that the network, when taught by human subjects performing the categorization task, improves its results by providing a better separation of the semantic categories.
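As a rough illustration of stage (2) of the abstract, the sketch below builds a feature vector from the energy of the power spectrum in log-spaced frequency bands and orientation sectors. It is a minimal sketch under stated assumptions, not the authors' model: the function name `log_polar_energies`, the parameters `n_scales` and `n_orients`, the hard-edged bands (standing in for the cortical-like filters) and the use of a global rather than local spectrum are all choices made here for illustration, and the retinal stage is omitted.

```python
import numpy as np

def log_polar_energies(image, n_scales=4, n_orients=6):
    """Sum spectral power over log-spaced radial bands and orientation sectors.

    Hypothetical stand-in for a bank of cortical-like filters: each
    (scale, orientation) cell is a hard-edged region of the power spectrum.
    """
    h, w = image.shape
    # Power spectrum, with the zero frequency shifted to the centre.
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2

    # Frequency coordinates in cycles per pixel, matching the shifted layout.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.hypot(fx, fy)
    # Orientation folded into [0, pi): the power spectrum of a real image
    # is symmetric about the origin.
    theta = np.mod(np.arctan2(fy, fx), np.pi)
    orient_bin = np.floor(theta / np.pi * n_orients).astype(int) % n_orients

    # Log-spaced band edges from the lowest non-zero frequency up to 0.5.
    edges = np.geomspace(1.0 / max(h, w), 0.5, n_scales + 1)

    feats = np.zeros((n_scales, n_orients))
    for s in range(n_scales):
        band = (radius >= edges[s]) & (radius < edges[s + 1])
        for o in range(n_orients):
            feats[s, o] = power[band & (orient_bin == o)].sum()
    return feats.ravel()
```

Because a circular shift of the image changes only the phase of its Fourier transform, the vector returned for an image and for `np.roll(image, (5, 7), axis=(0, 1))` should agree up to floating-point error, which mirrors the translation insensitivity claimed for the power spectrum.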
Keywords :
Fourier transforms; channel bank filters; computer vision; data analysis; image classification; image colour analysis; image representation; learning (artificial intelligence); spatiotemporal phenomena; visual perception; CCA; Fourier transform; cortical area V4; data analysis; dimension reduction; high-dimensional vector; high-level recognition task; human visual perception system; illuminant intensity; image categorization; image preprocessing; image representation; image spectra; log polar representation; machine learning; noise reduction; perception constancy; power frequency spectrum estimation; retina; retina spatio-temporal filter; rotation variability; scene variability; self-organizing neural network; visual cortex; visual signal preprocessing; visual system; Colored noise; Data analysis; Filters; Frequency estimation; Humans; Image recognition; Layout; Retina; Visual perception; Visual system;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning for Signal Processing, 2009. MLSP 2009. IEEE International Workshop on
Conference_Location :
Grenoble
Print_ISBN :
978-1-4244-4947-7
Electronic_ISBN :
978-1-4244-4948-4
Type :
conf
DOI :
10.1109/MLSP.2009.5306254
Filename :
5306254