DocumentCode
1340725
Title
Onsets Coincidence for Cross-Modal Analysis
Author
Barzelay, Zohar ; Schechner, Yoav Y.
Author_Institution
Dept. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel
Volume
12
Issue
2
fYear
2010
Firstpage
108
Lastpage
120
Abstract
Cross-modal analysis offers information beyond that extracted from individual modalities. Consider a nontrivial scene that includes several moving visual objects, some of which emit sounds. The scene is sensed by a camcorder having a single microphone. A task for audio-visual analysis is to assess the number of independent audio-associated visual objects (AVOs), pinpoint the AVOs' spatial locations in the video and isolate each corresponding audio component. We describe an approach that helps handle this challenge. The approach does not inspect the low-level data. Rather, it acknowledges the importance of mid-level features in each modality, which are based on significant temporal changes in each modality. A probabilistic formalism identifies temporal coincidences between these features, yielding cross-modal association and visual localization. This association is further utilized in order to isolate sounds that correspond to each of the localized visual features. This is of particular benefit in harmonic sounds, as it enables subsequent isolation of each audio source. We demonstrate this approach in challenging experiments. In these experiments, multiple objects move simultaneously, creating motion distractions for one another, and produce simultaneous sounds which mix.
Keywords
audio signal processing; computer vision; probability; sensor fusion; audio-associated visual objects; audio-visual analysis; camcorder; cross-modal analysis; harmonic sounds; machine vision; microphone; multisensor fusion; onsets coincidence; probabilistic formalism; Correlators; cross-sensor fusion; multimodal analysis
fLanguage
English
Journal_Title
Multimedia, IEEE Transactions on
Publisher
IEEE
ISSN
1520-9210
Type
jour
DOI
10.1109/TMM.2009.2037387
Filename
5340552
Link To Document