DocumentCode
2323982
Title
Semantic High-Level Features for Automated Cross-Modal Slideshow Generation
Author
Dunker, Peter ; Dittmar, Christian ; Begau, André ; Nowak, Stefanie ; Gruhne, Matthias
Author_Institution
Fraunhofer Institute for Digital Media Technology, Ilmenau
fYear
2009
fDate
3-5 June 2009
Firstpage
144
Lastpage
149
Abstract
This paper describes a technical solution for automated slideshow generation that extracts a set of high-level features from music, such as beat grid, mood, and genre, and intelligently combines this set with high-level image features, such as mood, daytime, and scene classification. An advantage of this high-level concept is that it enables the user to incorporate his preferences regarding the semantic aspects of music and images. For example, the user might request the system to automatically create a slideshow that plays soft music and shows pictures of sunsets from the last 10 years of his own photo collection. The high-level feature extraction on both the audio and the visual information is based on the same underlying machine learning core, which processes different low- and mid-level audio and visual features. This paper describes the technical realization and the evaluation of the algorithms with suitable test databases.
Keywords
audio signal processing; feature extraction; image classification; music; audio feature; automated cross-modal slideshow generation; photo collection; play soft music; semantic feature extraction; Feature extraction; Indexing; Layout; Machine learning; Machine learning algorithms; Mesh generation; Mood; Spatial databases; Testing; Visual databases; image and music retrieval; semantic indexing; slideshow generation
fLanguage
English
Publisher
IEEE
Conference_Titel
Content-Based Multimedia Indexing, 2009. CBMI '09. Seventh International Workshop on
Conference_Location
Chania
Print_ISBN
978-1-4244-4265-2
Electronic_ISBN
978-0-7695-3662-0
Type
conf
DOI
10.1109/CBMI.2009.32
Filename
5137832