DocumentCode :
52163
Title :
Extended Coding and Pooling in the HMAX Model
Author :
Theriault, Christian ; Thome, Nicolas ; Cord, Matthieu
Author_Institution :
Univ. Pierre et Marie Curie, Paris, France
Volume :
22
Issue :
2
fYear :
2013
fDate :
Feb. 2013
Firstpage :
764
Lastpage :
777
Abstract :
This paper presents an extension of the HMAX model, a neural network model for image classification. The HMAX model can be described as a four-level architecture, with the first level consisting of multiscale and multiorientation local filters. We introduce two main contributions to this model. First, we improve the way the local filters at the first level are integrated into more complex filters at the last level, providing a flexible description of object regions and combining local information of multiple scales and orientations. These new filters are discriminative and yet invariant, two key aspects of visual classification. We evaluate their discriminative power and their level of invariance to geometrical transformations on a synthetic image set. Second, we introduce a multiresolution spatial pooling. This pooling encodes both local and global spatial information to produce discriminative image signatures. Classification results are reported on three image data sets: Caltech101, Caltech256, and Fifteen Scenes. We show significant improvements over previous architectures using a similar framework.
Keywords :
filtering theory; image classification; image coding; neural nets; Caltech101; Caltech256; HMAX model; discriminative image signatures; discriminative power; extended coding; four-level architecture; geometrical transformations; image classification; image data sets; multiorientation local filters; multiresolution spatial pooling; multiscale local filters; neural network model; object region flexible description; synthetic image set; visual classification; Biological system modeling; Brain modeling; Convolution; Equations; Prototypes; Training; Visualization; Convolutional network; multiscale; object recognition; spatial pooling; vision;
fLanguage :
English
Journal_Title :
Image Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1057-7149
Type :
jour
DOI :
10.1109/TIP.2012.2222900
Filename :
6324437