DocumentCode :
448836
Title :
Cues extraction and hierarchical HMM based events inference in soccer video
Author :
Jin, Guoying ; Tao, Linmi ; Xu, Guangyou
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear :
2005
fDate :
Nov. 30 2005-Dec. 1 2005
Firstpage :
73
Lastpage :
76
Abstract :
This paper proposes a systematic framework to address an essential problem in content-based retrieval: the semantic gap between extractable low-level features and meaningful high-level semantics. Low-level features, which can be extracted directly from video streams, include color histograms, inter-frame differences, and edges. In theory, events can be detected from these features with hidden Markov models (HMMs) or dynamical Bayesian networks (DBNs), but in practice such models are too complicated to build and train. This paper proposes using cues as a stepping stone to bridge the gap between low-level features and high-level events. Cues have two salient characteristics: they are causally related to the events, and they can be deduced from features or extracted from video streams. Based on this idea, a systematic framework is constructed to analyze soccer videos, chosen as a test bed because events (i.e. shoot event, foul event, and normal process) can be clearly defined in a soccer game. First, the input video stream is segmented into shots based on features extracted directly from the video; second, semantic cues, such as slow-motion replay, player's face, caption, goalmouth, and shot frequency, are deduced or extracted from the shots; third, three HMMs are built and trained to infer the three events from the cues. Video streams generally contain more than one event, so an unavoidable problem is that shots must be grouped into sets that each contain only one event before HMM-based event inference. In other words, shots should be grouped so that each input observation sequence (a set of shots) fed into the HMMs fits at least one of the models. Because of this self-enwound control structure, a hierarchical HMM (HHMM) is employed to group shots and recognize events simultaneously in the video stream. The experiments show that the system is effective and robust in inferring events from roughly deduced or extracted cues.
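The third step of the abstract (one HMM per event, with the best-fitting model giving the label) can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: the cue alphabet, the two-state topology, every probability value, and the names `CUES`, `MODELS`, `log_forward`, and `classify` are hypothetical. The models here are not trained, and the sketch assumes the shots have already been grouped into a single-event set, which is the part the paper delegates to the hierarchical HMM.

```python
import math

# Hypothetical cue alphabet: one discrete cue symbol per shot (an assumption).
CUES = ["replay", "face", "caption", "goalmouth", "none"]

# Hypothetical, hand-picked parameters (not trained): (start, transition, emission).
MODELS = {
    "shoot": ([0.8, 0.2],
              [[0.6, 0.4], [0.3, 0.7]],
              [[0.10, 0.10, 0.05, 0.70, 0.05], [0.60, 0.20, 0.10, 0.05, 0.05]]),
    "foul": ([0.7, 0.3],
             [[0.5, 0.5], [0.4, 0.6]],
             [[0.10, 0.60, 0.20, 0.05, 0.05], [0.50, 0.30, 0.10, 0.05, 0.05]]),
    "normal": ([0.9, 0.1],
               [[0.9, 0.1], [0.5, 0.5]],
               [[0.02, 0.05, 0.03, 0.10, 0.80], [0.05, 0.10, 0.05, 0.20, 0.60]]),
}


def log_forward(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


def classify(cue_names, models=MODELS):
    """Label a cue sequence with the event whose HMM scores it highest."""
    obs = [CUES.index(name) for name in cue_names]
    return max(models, key=lambda event: log_forward(obs, *models[event]))


if __name__ == "__main__":
    print(classify(["goalmouth", "goalmouth", "replay", "replay"]))
```

With real data, the three parameter sets would be estimated from labeled cue sequences (for example with Baum-Welch) rather than written by hand.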
Keywords :
content-based retrieval; feature extraction; hidden Markov models; image colour analysis; inference mechanisms; video signal processing; video streaming; DBN; HHMM; color histogram; content-based retrieval; cues extraction; dynamical Bayesian networks; hidden Markov models; hierarchical HMM based events inference; high-level semantics; interframe differences; low-level features; observation sequences; self-enwound control structure; soccer video; video streams;
fLanguage :
English
Publisher :
IET
Conference_Titel :
Integration of Knowledge, Semantics and Digital Media Technology, 2005. EWIMT 2005. The 2nd European Workshop on the (Ref. No. 2005/11099)
Conference_Location :
London
ISSN :
0537-9989
Print_ISBN :
0-86341-595-4
Type :
conf
Filename :
1575955