DocumentCode :
3526113
Title :
Video event detection and summarization using audio, visual and text saliency
Author :
Evangelopoulos, G. ; Zlatintsi, A. ; Skoumas, G. ; Rapantzikos, K. ; Potamianos, A. ; Maragos, P. ; Avrithis, Y.
Author_Institution :
Sch. of ECE, Nat. Tech. Univ. of Athens, Athens
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
3553
Lastpage :
3556
Abstract :
Detection of perceptually important video events is formulated here on the basis of saliency models for the audio, visual and textual information conveyed in a video stream. Audio saliency is assessed by cues that quantify multifrequency waveform modulations, extracted through nonlinear operators and energy tracking. Visual saliency is measured through a spatiotemporal attention model driven by intensity, color and motion. Text saliency is extracted from part-of-speech tagging on the subtitle information available with most movie distributions. The various modality curves are integrated in a single attention curve, where the presence of an event may be signified in one or multiple domains. This multimodal saliency curve is the basis of a bottom-up video summarization algorithm that refines results from unimodal or audiovisual-based skimming. The algorithm performs favorably for video summarization in terms of informativeness and enjoyability.
Keywords :
audio signal processing; cinematography; frequency modulation; image colour analysis; image motion analysis; object detection; speech recognition; text analysis; video signal processing; video streaming; audio saliency; audiovisual-based skimming; bottom-up video summarization algorithm; energy tracking; movie distribution; multifrequency waveform modulation; multimodal saliency curve; nonlinear operator; part-of-speech tagging; spatiotemporal attention model; speech recognition; text saliency; unimodal skimming; video color; video event detection; video event summarization; video intensity; video motion; video stream; visual saliency; Data mining; Event detection; Frequency; Information analysis; Layout; Motion pictures; Spatiotemporal phenomena; Speech analysis; Streaming media; Tagging; audio; movie summarization; multimodal saliency; text processing; video; video abstraction;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960393
Filename :
4960393