DocumentCode :
3449383
Title :
A fusion scheme of visual and auditory modalities for event detection in sports video
Author :
Xu, Min ; Duan, Ling-Yu ; Xu, Chang-Sheng ; Tian, Qi
Author_Institution :
Inst. for Infocomm Res., Singapore, Singapore
Volume :
3
fYear :
2003
fDate :
6-10 April 2003
Abstract :
We propose an effective fusion scheme of visual and auditory modalities to detect events in sports video. The proposed scheme is built upon semantic shot classification, where we classify video shots into several major or interesting classes, each of which has a clear semantic meaning. Within the major shot classes we classify the different auditory signal segments (i.e. silence, hitting ball, applause, commentator speech) with the goal of detecting events with strong semantic meaning. For instance, for tennis video, we have identified five interesting events: serve, reserve, ace, return, and score. Since we have developed a unified framework for semantic shot classification in sports videos and a set of mid-level audio representations with supervised learning methods, the proposed fusion scheme can be easily adapted to a new sports game. We are extending this fusion scheme to three additional typical sports videos: basketball, volleyball and soccer. Correctly detected sports video events will greatly facilitate further structural and temporal analysis, such as sports video skimming and table-of-contents generation.
Keywords :
audio signal processing; image classification; image representation; learning (artificial intelligence); sport; video signal processing; ace; applause; audio mid-level representation; auditory modalities; auditory signal segments; basketball; commentator speech; event detection; fusion scheme; multimedia databases; reserve; return; score; semantic shot classification; serve; silence; soccer; sports video; sports video skimming; structural analysis; supervised learning methods; table of content; temporal analysis; tennis video; video indexing; visual modalities; volleyball; Cameras; Event detection; Games; Gunshot detection systems; Hidden Markov models; Indexing; Speech; Supervised learning; Support vector machine classification; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1199139
Filename :
1199139