• DocumentCode
    1463553
  • Title

    Content-based video parsing and indexing based on audio-visual interaction

  • Author

    Tsekeridou, Sofia ; Pitas, Ioannis

  • Author_Institution
    Dept. of Inf., Aristotelian Univ. of Thessaloniki, Greece
  • Volume
    11
  • Issue
    4
  • fYear
    2001
  • fDate
    4/1/2001 12:00:00 AM
  • Firstpage
    522
  • Lastpage
    535
  • Abstract
    A content-based video parsing and indexing method is presented in this paper, which analyzes both information sources (auditory and visual) and accounts for their inter-relations and synergy to extract high-level semantic information. Both frame- and object-based access to the visual information is employed. The aim of the method is to extract semantically meaningful video scenes and assign semantic label(s) to them. Due to the temporal nature of video, time has to be accounted for. Thus, time-constrained video representations and indices are generated. The current approach searches for specific types of content information relevant to the presence or absence of speakers or persons. Audio-source parsing and indexing leads to the extraction of a speaker label mapping of the source over time. Video-source parsing and indexing results in the extraction of a talking-face shot mapping over time. Integration of the audio and visual mappings constrained by interaction rules leads to higher levels of video abstraction and even partial detection of its context.
  • Keywords
    audio-visual systems; content-based retrieval; database indexing; feature extraction; image representation; video databases; video signal processing; audio mapping; audio-source indexing; audio-source parsing; audio-visual interaction; content information; content-based video indexing; content-based video parsing; frame-based access; high-level semantic information; information sources; interaction rules; object-based access; partial video detection; semantic labels; speaker label source mapping; talking-face shot mapping; time-constrained video indices; time-constrained video representations; video abstraction; video scenes; visual information; visual mapping; Content based retrieval; Data mining; ISO standards; Indexing; Information retrieval; Layout; MPEG 4 Standard; MPEG 7 Standard; Multimedia systems; Software libraries
  • fLanguage
    English
  • Journal_Title
    Circuits and Systems for Video Technology, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1051-8215
  • Type

    jour

  • DOI
    10.1109/76.915358
  • Filename
    915358