A method and browser for cross-referenced video summaries

Author

Aner, A. ; Tang, Lijun ; Kender, John R.

Author_Institution

Dept. of Comput. Sci., Columbia Univ., New York, NY, USA

Volume

2

fYear

2002

fDate

2002

Firstpage

237

Abstract

We present an automatic tool for compact representation and cross-referencing of long video sequences, which is based on a novel visual abstraction of semantic content. Our highly compact hierarchical representation results from the non-temporal clustering of scene segments into a new conceptual form grounded in the recognition of real-world backgrounds. We represent shots and scenes using mosaics and employ a novel method for the comparison of scenes based on these representative mosaics. We then cluster scenes together into a higher level of abstraction: the physical setting. We demonstrate our work using situation comedies (sitcoms), where each half-hour episode is well structured by rules governing background use. Consequently, browsing, indexing and comparison across videos by physical setting are very fast. Further, we show that physical settings lead to a higher-level contextual identification of the main plots in each video. We demonstrate these contributions with a browsing tool whose top-level single page displays the settings of several episodes. This page expands to display windows for each episode, and each episode menu summary is further expanded into scenes and shots, all by mouse-clicking on appropriate plots and settings according to user interests.

Keywords

image representation; image segmentation; image sequences; indexing; information retrieval; pattern clustering; video signal processing; automatic tool; background use; browsing; compact representation; comparison; contextual identification; cross-referenced video summaries; cross-referencing; episode menu summary; half-hour episode; highly compact hierarchical representation; indexing; long video sequences; mosaics; nontemporal clustering; physical setting; plots; real-world backgrounds; recognition; scene segments; semantic content; sitcoms; situation comedies; top-level single page; visual abstraction; Broadcasting; Cameras; Computer science; Displays; Indexing; Layout; Motion pictures; Multimedia communication; Video compression; Video sequences;

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia and Expo, 2002. ICME '02. Proceedings. 2002 IEEE International Conference on

Print_ISBN

0-7803-7304-9

Type

conf

DOI

10.1109/ICME.2002.1035560

Filename

1035560