Title :
Temporal Video Segmentation to Scenes Using High-Level Audiovisual Features
Author :
Sidiropoulos, Panagiotis ; Mezaris, Vasileios ; Kompatsiaris, Ioannis ; Meinedo, Hugo ; Bugalho, Miguel ; Trancoso, Isabel
Author_Institution :
Center for Research & Technology Hellas, Informatics & Telematics Institute, Thessaloniki, Greece
Abstract :
In this paper, a novel approach to video temporal decomposition into semantic units, termed scenes, is presented. In contrast to previous temporal segmentation approaches that employ mostly low-level visual or audiovisual features, we introduce a technique that jointly exploits low-level and high-level features automatically extracted from the visual and the auditory channel. This technique is built upon the well-known method of the scene transition graph (STG), first by introducing a new STG approximation that features reduced computational cost, and then by extending the unimodal STG-based temporal segmentation technique to a method for multimodal scene segmentation. The latter exploits, among others, the results of a large number of TRECVID-type trained visual concept detectors and audio event detectors, and is based on a probabilistic merging process that combines multiple individual STGs while at the same time diminishing the need for selecting and fine-tuning several STG construction parameters. The proposed approach is evaluated on three test datasets, comprising TRECVID documentary films, movies, and news-related videos, respectively. The experimental results demonstrate the improved performance of the proposed approach in comparison to other unimodal and multimodal techniques of the relevant literature and highlight the contribution of high-level audiovisual features toward improved video segmentation to scenes.
Keywords :
audio-visual systems; feature extraction; graph theory; image segmentation; merging; natural scenes; object detection; video coding; wireless channels; STG approximation; audio event detectors; audiovisual features; auditory channel; features extraction; multimodal scene segmentation; probabilistic merging process; scene transition graph; semantic units; video temporal decomposition; video temporal segmentation; visual channel; visual concept detectors; Approximation methods; Histograms; Joining processes; Semantics; Streaming media; Visualization; Audio events; scenes; video segmentation; visual concepts
Journal_Title :
Circuits and Systems for Video Technology, IEEE Transactions on
DOI :
10.1109/TCSVT.2011.2138830