DocumentCode :
1214115
Title :
A multiresolution manifold distance for invariant image similarity
Author :
Vasconcelos, Nuno ; Lippman, Andrew
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, La Jolla, CA, USA
Volume :
7
Issue :
1
fYear :
2005
Firstpage :
127
Lastpage :
142
Abstract :
Accounting for spatial image transformations is a requirement for multimedia problems such as video classification and retrieval, face/object recognition, or the creation of image mosaics from video sequences. We analyze a transformation-invariant metric recently proposed in the machine learning literature to measure the distance between image manifolds - the tangent distance (TD) - and show that it is closely related to alignment techniques from the motion analysis literature. Exposing these relationships benefits both domains. On one hand, it allows the knowledge acquired in the alignment literature to be leveraged to build better classifiers. On the other, it provides a new interpretation of alignment techniques as one component of a decomposition that has interesting properties for the classification of video. In particular, we embed the TD into a multiresolution framework that makes it significantly less prone to local minima. The new metric - multiresolution tangent distance (MRTD) - can be easily combined with robust estimation procedures, and exhibits significantly higher invariance to image transformations than the TD and the Euclidean distance (ED). For classification, this translates into significant improvements in face recognition accuracy. For video characterization, it leads to a decomposition of image dissimilarity into "differences due to camera motion" plus "differences due to scene activity" that is useful for classification. Experimental results on a movie database indicate that the distance could be used as a basis for the extraction of semantic primitives such as action and romance.
Keywords :
computational geometry; estimation theory; face recognition; feature extraction; image classification; image motion analysis; image retrieval; image segmentation; image sequences; object recognition; video signal processing; Euclidean distance; affine transformations; image decomposition; image mosaics; invariant image similarity; machine learning; movie database; multiresolution manifold distance; multiresolution robust estimation procedures; semantic movie classification; spatial image transformations; tangent distance; video characterization; video classification; video sequences; image analysis; image resolution; manifolds; motion measurement; spatial resolution
fLanguage :
English
Journal_Title :
IEEE Transactions on Multimedia
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2004.840596
Filename :
1386248