• DocumentCode
    2172521
  • Title

    Stereophonic spectrogram segmentation using Markov random fields

  • Author

    Kim, Minje ; Smaragdis, Paris ; Ko, Glenn G. ; Rutenbar, Rob A.

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
  • fYear
    2012
  • fDate
    23-26 Sept. 2012
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    There is considerable similarity between source separation approaches that use spectrograms captured from multiple microphones and computer vision algorithms that use multiple images for segmentation problems. Just as one would use Markov random fields (MRFs) to solve image segmentation problems, we propose a method of modeling source separation using MRFs, and then solving such problems via common MRF inference methods. To this end, as a preprocessing step, we convert stereophonic spectrograms into an integrated form based on their inter-channel level differences (ILD), a procedure analogous to obtaining a disparity map from stereo images for matching problems. Given the ILD matrix as an observed image, we estimate latent labels that stand for the responsibility of each spectrogram's time/frequency bin to a specific sound source. The proposed method achieves reasonable separation performance in a variety of mixing environments, including online separation and moving sources. We expect this new way of formulating source separation problems to help exploit the advantages of probabilistic graphical models and the recent advances in low-power, high-performance hardware suited for such tasks.
  • Keywords
    Markov processes; acoustic generators; blind source separation; computer vision; image matching; image sampling; image segmentation; inference mechanisms; matrix algebra; source separation; stereo image processing; time-frequency analysis; ILD matrix; MRF; Markov random fields; common MRF inference methods; computer vision algorithms; disparity map; image matching problems; interchannel level differences; low-power high-performance hardware; mixing environments; moving sources; multiple images segmentation problems; multiple microphones; observed image; probabilistic graphical models; reasonable separation performance; source separation modeling; spectrogram time-frequency bin; stereo images; stereophonic spectrogram segmentation; Labeling; Markov processes; Microphones; Noise; Source separation; Spectrogram; Time frequency analysis; Blind Source Separation; Gibbs Sampling; Markov Random Fields; Probabilistic Graphical Model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning for Signal Processing (MLSP), 2012 IEEE International Workshop on
  • Conference_Location
    Santander
  • ISSN
    1551-2541
  • Print_ISBN
    978-1-4673-1024-6
  • Electronic_ISBN
    1551-2541
  • Type

    conf

  • DOI
    10.1109/MLSP.2012.6349754
  • Filename
    6349754
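
The pipeline outlined in the abstract (build an ILD matrix from the two channel spectrograms, then infer a per-bin source label with an MRF) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The function names `ild_matrix` and `icm_segment`, the Gaussian unary cost, the Potts smoothness term, and the parameters `means`, `sigma`, and `beta` are all assumptions introduced here. Iterated conditional modes (ICM) is swapped in for the Gibbs sampling listed in the keywords, purely because it is deterministic and short.

```python
import numpy as np

def ild_matrix(left_mag, right_mag, eps=1e-12):
    """Inter-channel level difference in dB for each time/frequency bin."""
    return 20.0 * np.log10((left_mag + eps) / (right_mag + eps))

def icm_segment(ild, means, sigma=3.0, beta=3.0, n_iters=5):
    """Label bins on a 4-connected grid MRF using ICM (assumed model).

    means: assumed mean ILD (dB) of each source, one entry per label.
    sigma: assumed spread of the ILD around each mean.
    beta:  Potts penalty paid for each neighbor with a different label.
    """
    means = np.asarray(means, dtype=float)
    n_labels = means.size
    # Unary cost: squared deviation of the observed ILD from each source mean.
    unary = (ild[..., None] - means) ** 2 / (2.0 * sigma ** 2)  # (F, T, K)
    labels = unary.argmin(axis=-1)  # initialize from the unary term alone
    n_freq, n_time = ild.shape
    for _ in range(n_iters):
        for f in range(n_freq):
            for t in range(n_time):
                cost = unary[f, t].copy()
                for df, dt in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nf, nt = f + df, t + dt
                    if 0 <= nf < n_freq and 0 <= nt < n_time:
                        # Add beta to every label that disagrees with this neighbor.
                        cost += beta * (np.arange(n_labels) != labels[nf, nt])
                labels[f, t] = cost.argmin()
    return labels
```

Under these assumptions, `labels == k` gives a binary time/frequency mask for source `k`, which could be applied to either channel's spectrogram before inversion.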