• DocumentCode
    1340725
  • Title

    Onsets Coincidence for Cross-Modal Analysis

  • Author

    Barzelay, Zohar ; Schechner, Yoav Y.

  • Author_Institution
    Dept. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel
  • Volume
    12
  • Issue
    2
  • fYear
    2010
  • Firstpage
    108
  • Lastpage
    120
  • Abstract
    Cross-modal analysis offers information beyond that extracted from individual modalities. Consider a nontrivial scene, that includes several moving visual objects, some of which emit sounds. The scene is sensed by a camcorder having a single microphone. A task for audio-visual analysis is to assess the number of independent audio-associated visual objects (AVOs), pinpoint the AVOs' spatial locations in the video and isolate each corresponding audio component. We describe an approach that helps handle this challenge. The approach does not inspect the low-level data. Rather, it acknowledges the importance of mid-level features in each modality, which are based on significant temporal changes in each modality. A probabilistic formalism identifies temporal coincidences between these features, yielding cross-modal association and visual localization. This association is further utilized in order to isolate sounds that correspond to each of the localized visual features. This is of particular benefit in harmonic sounds, as it enables subsequent isolation of each audio source. We demonstrate this approach in challenging experiments. In these experiments, multiple objects move simultaneously, creating motion distractions for one another, and produce simultaneous sounds which mix.
  • Keywords
    audio signal processing; computer vision; probability; sensor fusion; audio-associated visual objects; audio-visual analysis; camcorder; cross-modal analysis; harmonic sounds; machine vision; microphone; multisensor fusion; onsets coincidence; probabilistic formalism; Correlators; cross-sensor fusion; multimodal analysis
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type

    jour

  • DOI
    10.1109/TMM.2009.2037387
  • Filename
    5340552