Title :
Learning sparse generative models of audiovisual signals
Author :
Monaci, Gianluca ; Sommer, Friedrich T. ; Vandergheynst, Pierre
Author_Institution :
Redwood Center for Theoretical Neuroscience, University of California, Berkeley, Berkeley, CA, USA
Abstract :
This paper presents a novel framework for learning sparse representations of audiovisual signals. An audiovisual signal is modeled as a sparse sum of audiovisual kernels. The kernels are bimodal functions made of synchronous audio and video components that can be positioned independently and arbitrarily in space and time. We design an algorithm capable of learning sets of such audiovisual, synchronous, shift-invariant functions by alternating between a coding step and a learning step. The proposed methodology is used to learn audiovisual features from a set of bimodal sequences. The basis functions that emerge are audio-video pairs that capture salient data structures.
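The alternation between coding and learning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that both modalities are 1-D sequences on a common time grid (the paper's video components are spatio-temporal), that each kernel is an (audio, video) pair of equal length with unit joint norm, that coding is done by plain matching pursuit, and that learning is a gradient step on the residual. All function names and the learning rate `lr` are hypothetical.

```python
import numpy as np

def joint_correlation(res_a, res_v, ker_a, ker_v):
    # Inner product of the bimodal kernel with the residual at every time shift.
    # The same shift is applied to both modalities (synchrony assumption).
    return (np.correlate(res_a, ker_a, mode="valid")
            + np.correlate(res_v, ker_v, mode="valid"))

def matching_pursuit(sig_a, sig_v, kernels, n_atoms):
    # Coding step: greedily pick the (kernel, shift) pair that best matches
    # the current residual, then subtract its contribution.
    res_a = np.asarray(sig_a, dtype=float).copy()
    res_v = np.asarray(sig_v, dtype=float).copy()
    code = []
    for _ in range(n_atoms):
        best = None
        for k, (ka, kv) in enumerate(kernels):
            c = joint_correlation(res_a, res_v, ka, kv)
            t = int(np.argmax(np.abs(c)))
            if best is None or abs(c[t]) > abs(best[2]):
                best = (k, t, float(c[t]))
        k, t, coef = best
        ka, kv = kernels[k]
        res_a[t:t + len(ka)] -= coef * ka
        res_v[t:t + len(kv)] -= coef * kv
        code.append((k, t, coef))
    return code, (res_a, res_v)

def update_kernels(kernels, code, residual, lr=0.1):
    # Learning step (assumed form): move each kernel along the residual
    # patches at the positions where it was used, then renormalize.
    res_a, res_v = residual
    updated = []
    for k, (ka, kv) in enumerate(kernels):
        grad_a, grad_v = np.zeros_like(ka), np.zeros_like(kv)
        for kk, t, coef in code:
            if kk == k:
                grad_a += coef * res_a[t:t + len(ka)]
                grad_v += coef * res_v[t:t + len(kv)]
        ka, kv = ka + lr * grad_a, kv + lr * grad_v
        norm = np.sqrt(np.sum(ka ** 2) + np.sum(kv ** 2))
        updated.append((ka / norm, kv / norm))
    return updated
```

In use, one would call `matching_pursuit` on each training sequence with the current kernels, pass the resulting code and residual to `update_kernels`, and repeat until the kernels stop changing.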
Keywords :
audio coding; audio-visual systems; learning (artificial intelligence); video coding; audiovisual kernel; audiovisual signal modeling; bimodal function; bimodal sequences; coding procedure; learning sparse generative model; salient data structures; shift invariant functions; sparse representation; synchronous audio components; synchronous video components; Dictionaries; Encoding; Kernel; Matching pursuit algorithms; Signal processing; Signal processing algorithms; Visualization;
Conference_Title :
16th European Signal Processing Conference (EUSIPCO), 2008
Conference_Location :
Lausanne