Author_Institution :
Noah's Ark Lab., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Abstract :
Good mid-level features are known to greatly enhance image classification performance, but how to learn such features efficiently is still an open question. In this paper, we present an efficient unsupervised mid-level feature learning approach (MidFea), which involves only simple operations such as k-means clustering, convolution, pooling, vector quantization, and random projection. We show that this simple feature can achieve good performance on traditional classification tasks. To further boost performance, we model the neuron selectivity (NS) principle by building an additional layer over the mid-level features, prior to the classifier. The NS-layer learns category-specific neurons in a supervised manner with both bottom-up inference and top-down analysis, and thus supports fast inference for a query image. Through extensive experiments, we demonstrate that this higher-level NS-layer notably improves classification accuracy with our simple MidFea, achieving comparable performance on face recognition, gender classification, age estimation, and object categorization. In particular, our approach performs inference an order of magnitude faster than sparse coding-based feature learning methods. In conclusion, we argue that carefully learned features (MidFea) bring improved performance, and that a sophisticated mechanism at a higher level (the NS-layer) boosts performance further.
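To make the simple operations named in the abstract concrete, the following is a minimal NumPy sketch of an unsupervised pipeline of that general kind: k-means filters learned from random patches, convolution with rectification, max pooling, and a random projection. It is not the authors' MidFea implementation. All function names, the patch size, the number of filters, the pooling size, and the output dimension are assumptions chosen for illustration, and the vector quantization step and the NS-layer are omitted.

```python
import numpy as np


def extract_patches(images, size, n_patches, rng):
    """Sample random size x size patches from a stack of grayscale images."""
    n, h, w = images.shape
    idx = rng.integers(0, n, n_patches)
    ys = rng.integers(0, h - size + 1, n_patches)
    xs = rng.integers(0, w - size + 1, n_patches)
    patches = np.stack([images[i, y:y + size, x:x + size].ravel()
                        for i, y, x in zip(idx, ys, xs)])
    # Subtract each patch's mean so filters capture structure, not brightness.
    return patches - patches.mean(axis=1, keepdims=True)


def kmeans(data, k, n_iter, rng):
    """Plain Lloyd's k-means; returns the k centroids (used here as filters)."""
    centroids = data[rng.choice(len(data), k, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def convolve_and_pool(image, filters, size, pool):
    """Correlate the image with every filter, rectify, then max-pool."""
    windows = np.lib.stride_tricks.sliding_window_view(image, (size, size))
    oh, ow = windows.shape[:2]
    responses = windows.reshape(oh, ow, -1) @ filters.T  # shape (oh, ow, k)
    responses = np.maximum(responses, 0.0)
    ph, pw = oh // pool, ow // pool
    cropped = responses[:ph * pool, :pw * pool]
    pooled = cropped.reshape(ph, pool, pw, pool, -1).max(axis=(1, 3))
    return pooled.ravel()


def midlevel_features(images, k=8, size=5, pool=4, out_dim=32, seed=0):
    """Hypothetical end-to-end sketch: filters -> feature maps -> projection."""
    rng = np.random.default_rng(seed)
    patches = extract_patches(images, size, 500, rng)
    filters = kmeans(patches, k, 10, rng)
    feats = np.stack([convolve_and_pool(img, filters, size, pool)
                      for img in images])
    # Random projection: a fixed Gaussian matrix reduces the dimensionality.
    projection = rng.standard_normal((feats.shape[1], out_dim)) / np.sqrt(out_dim)
    return feats @ projection
```

Every step in this sketch is a matrix product, a comparison, or a mean, with no iterative optimization at inference time. That is the kind of property that allows fast inference relative to sparse coding, although the speed figure quoted in the abstract refers to the authors' own system, not to this sketch.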
Keywords :
feature extraction; image classification; image coding; inference mechanisms; unsupervised learning; MidFea; NS layer; age estimation; bottom-up inference; category specific neuron; face recognition; gender classification; neuron selectivity principle; object categorization; query image; sparse coding-based feature learning method; top-down analysis; unsupervised midlevel feature learning approach; convolution; dictionaries; encoding; neurons; three-dimensional displays; feature learning; mid-level feature; neuron selectivity; structural sparse coding