DocumentCode
1037445
Title
Mining Appearance Models Directly From Compressed Video
Author
Chen, Datong ; Liu, Qiang ; Sun, Mingui ; Yang, Jie
Author_Institution
Carnegie Mellon Univ., Pittsburgh
Volume
10
Issue
2
fYear
2008
Firstpage
268
Lastpage
276
Abstract
In this paper, we propose an approach for learning appearance models of moving objects directly from compressed video. The appearance of a moving object changes dynamically in video due to varying object poses, lighting conditions, and partial occlusions. Efficiently mining the appearance models of objects is a crucial and challenging task in supporting content-based video coding, clustering, indexing, and retrieval at the object level. The proposed approach learns the appearance models of moving objects in the spatial-temporal dimension of video data by taking advantage of the MPEG video compression format. It detects a moving object and recovers the trajectory of each macroblock covered by the object using the motion vectors present in the compressed stream. The appearances are then reconstructed in the DCT domain along the object's trajectory and modeled as a mixture of Gaussians (MoG) using DCT coefficients. We prove that, under certain assumptions, the MoG model learned in the DCT domain can achieve pixel-level accuracy when transformed back to the spatial domain, and has better band selectivity than the MoG model learned in the spatial domain. Finally, we cluster the MoG models to merge the appearance models of the same object for object-level content analysis.
Keywords
Gaussian processes; content-based retrieval; data compression; data mining; discrete cosine transforms; image reconstruction; indexing; learning (artificial intelligence); motion estimation; object detection; pattern clustering; video coding; video retrieval; AI learning; DCT; Gaussian process; MPEG; appearance model; content-based video coding; motion vector; moving object detection; spatial-temporal dimension; video clustering; video compression; video indexing; Content based retrieval; Motion detection; Streaming media; Transform coding; Compressed video; DCT domain; object appearance modeling; video mining
fLanguage
English
Journal_Title
Multimedia, IEEE Transactions on
Publisher
IEEE
ISSN
1520-9210
Type
jour
DOI
10.1109/TMM.2007.911835
Filename
4432617