• DocumentCode
    2351497
  • Title

    Audio-visual event detection using duration dependent input output Markov models

  • Author

    Naphade, Milind R. ; Garg, Ashutosh ; Huang, Thomas S.

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    39
  • Lastpage
    43
  • Abstract
    Analysis of audio-visual data and detection of semantic events with spatio-temporal support is a challenging multimedia understanding problem. The difficulty lies in the gap between low-level media features and high-level semantic concepts. We introduce a duration dependent input output Markov model (DDIOMM) to detect events based on multiple modalities. The DDIOMM combines the ability to model non-exponential duration densities with the mapping of input sequences to output sequences. We test the DDIOMM by modeling the audio-visual event "explosion". We compare the detection performance of the DDIOMM with that of the IOMM and the HMM. Experiments reveal that modeling duration improves detection performance.
  • Keywords
    Markov processes; audio-visual systems; feature extraction; multimedia systems; DDIOMM; HMM; audio-visual data analysis; audio-visual event detection; audio-visual event explosion; detection performance; duration dependent input output Markov models; high level semantic concept; input sequences; low level media features; multimedia understanding problem; multiple modalities; nonexponential duration densities; output sequences; semantic event detection; spatio-temporal support; Bayesian methods; Data analysis; Data mining; Event detection; Explosions; Fellows; Hidden Markov models; Motion pictures; Streaming media; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Content-Based Access of Image and Video Libraries, 2001. (CBAIVL 2001). IEEE Workshop on
  • Conference_Location
    Kauai, HI
  • Print_ISBN
    0-7695-1354-9
  • Type

    conf

  • DOI
    10.1109/IVL.2001.990854
  • Filename
    990854