Author_Institution :
Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Abstract :
Recent techniques based on sparse representation (SR) have demonstrated promising performance in high-level visual recognition, exemplified by highly accurate face recognition under occlusion and other sparse corruptions. Most research in this area has focused on classification algorithms using raw image pixels, and very few methods have been proposed to utilize quantized visual features, such as the popular bag-of-words feature abstraction. In such cases, besides the inherent quantization errors, ambiguity in visual word assignment and misdetection of feature points, caused by factors such as visual occlusion and noise, are the major causes of dense corruption of the quantized representation. Dense corruptions can jeopardize the decision process by distorting the patterns of the sparse reconstruction coefficients. In this paper, we aim to eliminate the corruptions and achieve robust image analysis with SR. Toward this goal, we introduce two transfer processes (ambiguity transfer and misdetection transfer) to account for the two major sources of corruption discussed above. Assuming that the two kinds of distortion processes are rare, we augment the original SR-based reconstruction objective with ℓ0-norm regularization on the transfer terms to encourage sparsity and, hence, discourage dense distortion/transfer. Computationally, we relax the nonconvex ℓ0-norm optimization into a convex ℓ1-norm optimization problem and employ the accelerated proximal gradient method, whose updating procedure is provably convergent. Extensive experiments on four benchmark datasets, Caltech-101, Caltech-256, Corel-5k, and CMU Pose, Illumination, and Expression (PIE), demonstrate the necessity of removing the quantization corruptions and the various advantages of the proposed framework.
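The ℓ1 relaxation and accelerated proximal gradient step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the full objective with its transfer terms, so the code below assumes a plain ℓ1-regularized least-squares problem as a stand-in; the names `apg_lasso`, `soft_threshold`, `D`, `y`, and `lam` are hypothetical and not taken from the paper.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def apg_lasso(D, y, lam, n_iter=500):
    """Minimise 0.5 * ||y - D x||_2^2 + lam * ||x||_1.

    Uses an accelerated proximal gradient (FISTA-style) iteration.
    This is an assumed stand-in objective, not the paper's full model.
    """
    # Lipschitz constant of the gradient of the smooth term:
    # the squared spectral norm of D.
    L = np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    z = x.copy()  # extrapolated point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = D.T @ (D @ z - y)                       # gradient of the smooth term at z
        x_new = soft_threshold(z - grad / L, lam / L)  # proximal (shrinkage) step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x
```

In this sketch the ℓ1 term enters only through the shrinkage step, which is what makes the relaxed problem tractable. With an identity dictionary the minimiser is simply the soft-thresholded input, which gives a quick sanity check.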
Keywords :
image classification; image recognition; image representation; quantisation (signal); CMU pose; Caltech-101; Caltech-256; Corel-5k; SR-based reconstruction objective; ambiguity transfer; bag-of-words feature abstraction; classification algorithms; decision process; feature points misdetection; high-level visual recognition; illumination; l0-norm regularisation; misdetection transfer; quantization corruptions; quantization errors; quantized representation; quantized visual features; robust image analysis; sparse corruptions; sparse reconstruction coefficients; sparse representation; visual occlusions; Feature extraction; Histograms; Noise; Optimization; Quantization; Vectors; Visualization; Image classification; quantized visual feature; sparse representation; Algorithms; Artificial Intelligence; Image Enhancement; Image Interpretation, Computer-Assisted; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity;