Title :
A non-local cost aggregation method for stereo matching
Author_Institution :
City Univ. of Hong Kong, Hong Kong, China
Abstract :
Matching cost aggregation is one of the oldest methods for stereo correspondence and remains popular. While effective and efficient, cost aggregation methods typically aggregate the matching cost by summing or averaging over a user-specified, local support region. This is only locally optimal, and the computational complexity of the full-kernel implementation usually depends on the region size. In this paper, the cost aggregation problem is re-examined and a non-local solution is proposed. The matching cost values are aggregated adaptively, based on pixel similarity, on a tree structure derived from the stereo image pair so that depth edges are preserved. The nodes of this tree are all the image pixels, and the edges are all the edges between the nearest neighboring pixels. The similarity between any two pixels is decided by their shortest distance on the tree. The proposed method is non-local because every node receives support from all other nodes on the tree. As can be expected, the proposed non-local solution outperforms all local cost aggregation methods on the standard (Middlebury) benchmark. It also has the advantage of extremely low computational complexity: only a total of 2 addition/subtraction operations and 3 multiplication operations are required for each pixel at each disparity level. This is very close to the complexity of unnormalized box filtering using an integral image, which requires 6 addition/subtraction operations. The unnormalized box filter is the fastest local cost aggregation method, but it blurs across depth edges. The proposed method was tested on a MacBook Air laptop computer with a 1.8 GHz Intel Core i7 CPU and 4 GB memory. The average runtime on the Middlebury data sets is about 90 milliseconds, only about 1.25× slower than the unnormalized box filter. A non-local disparity refinement method, based on the non-local cost aggregation method, is also proposed.
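The abstract does not spell out how every node can receive support from all other nodes at a constant cost per pixel. One way this can work is a two-pass traversal of the tree, sketched below for a single disparity level. The sketch rests on several assumptions that are not stated in the abstract: the tree is already built and given as a `parent` array, the similarity across an edge is `exp(-weight / sigma)` (so similarity along a path is the product over its edges), and the names `aggregate_on_tree`, `order`, `edge_weight` and `sigma` are hypothetical. It is an illustration of the idea, not the authors' implementation.

```python
import math


def aggregate_on_tree(parent, order, edge_weight, cost, sigma=0.1):
    """Aggregate per-node costs over a tree in two passes (illustrative sketch).

    parent[v]      : parent of node v, or -1 for the root
    order          : all nodes, listed so every parent precedes its children
    edge_weight[v] : weight of the edge (v, parent[v]); ignored for the root
    cost[v]        : raw matching cost of node v at one disparity level
    """
    n = len(cost)
    # Assumed similarity across each tree edge; the root has no parent edge.
    s = [0.0 if parent[v] < 0 else math.exp(-edge_weight[v] / sigma)
         for v in range(n)]

    # Pass 1 (leaves to root): each node collects support from its subtree.
    # One multiplication and one addition per node.
    up = list(cost)
    for v in reversed(order):
        p = parent[v]
        if p >= 0:
            up[p] += s[v] * up[v]

    # Pass 2 (root to leaves): add the support coming from outside the subtree.
    # With (1 - s^2) precomputed per edge, this is two multiplications and one
    # addition per node.
    agg = list(up)
    for v in order:
        p = parent[v]
        if p >= 0:
            agg[v] = s[v] * agg[p] + (1.0 - s[v] * s[v]) * up[v]
    return agg


# Tiny example tree: node 0 is the root, 1 and 2 are its children,
# 3 and 4 are children of node 1.
parent = [-1, 0, 0, 1, 1]
order = [0, 1, 2, 3, 4]
edge_weight = [0.0, 0.1, 0.2, 0.05, 0.3]
cost = [1.0, 2.0, 3.0, 4.0, 5.0]
aggregated = aggregate_on_tree(parent, order, edge_weight, cost)
```

Under these assumptions the per-node work adds up to 2 additions and 3 multiplications, which is consistent with the operation count quoted in the abstract. The result equals the direct sum over all node pairs of similarity times cost, but it is obtained in time linear in the number of nodes.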
Keywords :
computational complexity; computer vision; image matching; stereo image processing; trees (mathematics); MacBook Air laptop computer; Middlebury benchmark; Middlebury data set; addition operation; depth edge preservation; disparity level; full-kernel implementation; image pixel similarity; integral image; matching cost value; multiplication operation; nearest neighboring pixels; nonlocal cost aggregation method; nonlocal disparity refinement method; nonlocal solution; stereo correspondence; stereo image pair; stereo matching; subtraction operation; tree nodes; tree structure; unnormalized box filtering; user-specified local support region; Image color analysis; Image edge detection; Portable computers; Runtime; Stereo vision
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on
Conference_Location :
Providence, RI
Print_ISBN :
978-1-4673-1226-4
Electronic_ISBN :
1063-6919
DOI :
10.1109/CVPR.2012.6247827