Title :
Scale-Invariant Visual Language Modeling for Object Categorization
Author :
Wu, Lei ; Hu, Yang ; Li, Mingjing ; Yu, Nenghai ; Hua, Xian-Sheng
Author_Institution :
Dept. of Electr. Eng. & Inf. Sci., Univ. of Sci. & Technol. of China, Hefei
Abstract :
In recent years, "bag-of-words" models, which treat an image as a collection of unordered visual words, have been widely applied in the multimedia and computer vision fields. However, their ignorance of the spatial structure among visual words makes them indiscriminative for objects with similar word frequencies but different word spatial distributions. In this paper, we propose a visual language modeling method (VLM), which incorporates the spatial context of the local appearance features into the statistical language model. To represent the object categories, models with different orders of statistical dependencies have been exploited. In addition, the multilayer extension to the VLM makes it more resistant to scale variations of objects. The model is effective and applicable to large scale image categorization. We train scale invariant visual language models based on the images which are grouped by Flickr tags, and use these models for object categorization. Experimental results show they achieve better performance than single layer visual language models and "bag-of-words" models. They also achieve comparable performance with 2-D MHMM and SVM-based methods, while costing much less computational time.
Keywords :
computer vision; image retrieval; visual languages; large scale image categorization; object categorization; scale-invariant visual language modeling; statistical language model; unordered visual words; Content based retrieval; Context modeling; Costing; Data mining; Frequency; Image analysis; Large-scale systems; Nonhomogeneous media; content-based image retrieval; image classification; visual language model
Journal_Title :
Multimedia, IEEE Transactions on
DOI :
10.1109/TMM.2008.2009692