DocumentCode
82318
Title
Exploiting Semantic and Visual Context for Effective Video Annotation
Author
Jian Yi ; Yuxin Peng ; Jianguo Xiao
Author_Institution
Inst. of Comput. Sci. & Technol., Peking Univ., Beijing, China
Volume
15
Issue
6
fYear
2013
fDate
Oct. 2013
Firstpage
1400
Lastpage
1414
Abstract
We propose a new method to refine video annotation results by exploiting the semantic and visual context of video. On one hand, semantic context mining is performed in a supervised way, using the manual concept labels of the training set. It is very useful for boosting video annotation performance, because semantic context is learned from labels given by people, indicating human intention. In this paper, we model the spatial and temporal context in video using conditional random fields with different structures. Compared with existing methods, ours more accurately captures concept relationships in video and more effectively improves video annotation performance. On the other hand, visual context mining is performed in a semi-supervised way based on the visual similarities among video shots. It reflects the natural visual properties of video and can be regarded as compensation for semantic context, which generally cannot be perfectly modeled. In this paper, we construct a graph based on the visual similarities among shots. A semi-supervised learning approach is then adopted on the graph to propagate the probabilities of reliable shots to other shots with similar visual features. Extensive experimental results on the widely used TRECVID datasets demonstrate the effectiveness of our method in improving video annotation accuracy.
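The semi-supervised visual-context step can be illustrated with a standard graph-based label-propagation scheme. The sketch below is hypothetical, not the authors' exact formulation: the affinity matrix `W`, the shot count, and the parameter `alpha` are all invented for illustration, assuming only NumPy.

```python
import numpy as np

# Illustrative sketch of label propagation on a shot-similarity graph.
# W holds hypothetical pairwise visual similarities among 4 video shots;
# shot 0 plays the role of a "reliable" shot with a confident concept score.
W = np.array([[0.0, 1.0, 0.2, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.2, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
y = np.array([1.0, 0.0, 0.0, 0.0])  # initial concept probabilities per shot

d = W.sum(axis=1)
S = W / np.sqrt(np.outer(d, d))     # symmetric normalization D^{-1/2} W D^{-1/2}

alpha = 0.9                         # weight on the graph vs. the initial scores
f = y.copy()
for _ in range(200):                # iterate to (approximate) convergence
    f = alpha * (S @ f) + (1 - alpha) * y

# Shots visually similar to the reliable shot receive higher refined scores.
print(np.round(f, 3))
```

Because the normalized affinity has spectral radius at most 1 and `alpha < 1`, the iteration converges; refined probabilities decay with graph distance from the reliable shot.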
Keywords
data mining; learning (artificial intelligence); random processes; video retrieval; video signal processing; TRECVID datasets; conditional random fields; natural visual property; semantic context learning; semantic context mining; semi-supervised learning approach; spatial context; temporal context; training set; video annotation performance; video shots; visual context mining; visual features; visual similarities; Accuracy; Context; Context modeling; Humans; Reliability; Semantics; Visualization; Context mining; semantic context; video annotation; video retrieval; visual context;
fLanguage
English
Journal_Title
IEEE Transactions on Multimedia
Publisher
ieee
ISSN
1520-9210
Type
jour
DOI
10.1109/TMM.2013.2250266
Filename
6475188
Link To Document