Regional Information Center for Science and Technology - An Enhanced Bag-of-Visual Word Vector Space Model to Represent Visual Content in Athletics Images

DocumentCode :

1341524

Title :

An Enhanced Bag-of-Visual Word Vector Space Model to Represent Visual Content in Athletics Images

Author :

Kesorn, Kraisak ; Poslad, Stefan

Author_Institution :

Comput. Sci. & Inf. Technol. Dept., Naresuan Univ., Phitsanulok, Thailand

Volume :

Issue :

fYear :

2012

Firstpage :

211

Lastpage :

222

Abstract :

Images that differ in visual appearance may still be semantically related at a higher level of conceptualization. However, image classification and retrieval systems tend to rely only on the low-level visual structure within images. This paper presents a framework to deal with this semantic gap limitation by exploiting the well-known bag-of-visual words (BVW) model to represent visual content. The novelty of this paper is threefold. First, the quality of visual words is improved by constructing visual words from representative keypoints. Second, domain-specific “non-informative visual words” are detected; these are useless for representing the content of visual data and can degrade the categorization capability. Distinct from existing frameworks, two main characteristics of non-informative visual words are defined: a high document frequency (DF) and a small statistical association with all the concepts in the collection. Third, a novel method is used to restructure the vector space model of visual words with respect to a structural ontology model in order to resolve visual synonym and polysemy problems. The experimental results show that the method can disambiguate visual word senses effectively and can significantly improve classification, interpretation, and retrieval performance for the athletics images.
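
The two characteristics of non-informative visual words named in the abstract (high document frequency, weak association with every concept) can be illustrated with a minimal sketch. This is not the authors' implementation: the chi-square statistic as the association measure, the function names `chi_square` and `non_informative_words`, the thresholds, and the toy data are all assumptions made for illustration.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 word/concept contingency table.

    n11: images containing the word and labeled with the concept
    n10: images containing the word, labeled with another concept
    n01: images without the word, labeled with the concept
    n00: images without the word, labeled with another concept
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        # A degenerate table (e.g. the word occurs in every image)
        # carries no association information.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def non_informative_words(images, df_threshold, chi2_threshold):
    """Flag words with high DF and low association with ALL concepts.

    images: list of (set_of_visual_word_ids, concept_label) pairs.
    Both thresholds are hypothetical tuning parameters.
    """
    n_images = len(images)
    vocabulary = set().union(*(words for words, _ in images))
    concepts = {label for _, label in images}
    flagged = set()
    for w in vocabulary:
        # Criterion 1: fraction of images that contain the word.
        df = sum(1 for words, _ in images if w in words) / n_images
        if df < df_threshold:
            continue
        # Criterion 2: strongest association with any single concept.
        max_chi2 = 0.0
        for c in concepts:
            n11 = sum(1 for ws, lb in images if w in ws and lb == c)
            n10 = sum(1 for ws, lb in images if w in ws and lb != c)
            n01 = sum(1 for ws, lb in images if w not in ws and lb == c)
            n00 = sum(1 for ws, lb in images if w not in ws and lb != c)
            max_chi2 = max(max_chi2, chi_square(n11, n10, n01, n00))
        if max_chi2 <= chi2_threshold:
            flagged.add(w)
    return flagged


# Toy collection: word 0 occurs everywhere, words 1 and 2 are concept-specific.
images = [
    ({0, 1}, "hurdles"),
    ({0, 1}, "hurdles"),
    ({0, 2}, "pole_vault"),
    ({0, 2}, "pole_vault"),
]
flagged = non_informative_words(images, df_threshold=0.5, chi2_threshold=1.0)
```

In this toy collection, word 0 appears in every image and is equally common under both concepts, so it is the only word flagged; words 1 and 2 also reach the DF threshold but are strongly tied to one concept each and are kept.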

Keywords :

image classification; image representation; image retrieval; natural language processing; ontologies (artificial intelligence); statistical analysis; vectors; athletic image interpretation; categorization capability; document frequency; domain specific noninformative visual word; enhanced bag of visual word vector space model; image classification; image retrieval; noninformative visual word; representative keypoints; semantic gap limitation; statistical association; structural ontology model; visual appearance; visual content; visual data content; visual polysemy problem; visual synonym problem; visual word sense disambiguation; Accuracy; Image retrieval; Noise; Principal component analysis; Semantics; Videos; Visualization; Bag-of-visual words; non-informative visual words discovery; ontology model; visual content representation; visual words disambiguation;

fLanguage :

English

Journal_Title :

Multimedia, IEEE Transactions on

Publisher :

IEEE

ISSN :

1520-9210

Type :

jour

DOI :

10.1109/TMM.2011.2170665

Filename :

6035787

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1341524