DocumentCode
2351497
Title
Audio-visual event detection using duration dependent input output Markov models
Author
Naphade, Milind R. ; Garg, Ashutosh ; Huang, Thomas S.
Author_Institution
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
fYear
2001
fDate
2001
Firstpage
39
Lastpage
43
Abstract
Analysis of audio-visual data and detection of semantic events with spatio-temporal support is a challenging multimedia understanding problem. The difficulty lies in the gap between low-level media features and high-level semantic concepts. We introduce a duration dependent input output Markov model (DDIOMM) to detect events based on multiple modalities. The DDIOMM combines the ability to model non-exponential duration densities with the mapping of input sequences to output sequences. We test the DDIOMM by modeling the audio-visual event "explosion". We compare the detection performance of the DDIOMM with that of the IOMM and the HMM. Experiments reveal that modeling duration improves detection performance.
Keywords
Markov processes; audio-visual systems; feature extraction; multimedia systems; DDIOMM; HMM; audio-visual data analysis; audio-visual event detection; audio-visual event explosion; detection performance; duration dependent input output Markov models; high level semantic concept; input sequences; low level media features; multimedia understanding problem; multiple modalities; nonexponential duration densities; output sequences; semantic event detection; spatio-temporal support; Bayesian methods; Data analysis; Data mining; Event detection; Explosions; Fellows; Hidden Markov models; Motion pictures; Streaming media; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Content-Based Access of Image and Video Libraries, 2001. (CBAIVL 2001). IEEE Workshop on
Conference_Location
Kauai, HI
Print_ISBN
0-7695-1354-9
Type
conf
DOI
10.1109/IVL.2001.990854
Filename
990854