• DocumentCode
    1500839
  • Title
    Maximum a Posteriori Binary Mask Estimation for Underdetermined Source Separation Using Smoothed Posteriors
  • Author
    Cobos, Maximo ; Lopez, Jose J.
  • Author_Institution
    Comput. Sci. Dept., Univ. de Valencia, Valencia, Spain
  • Volume
    20
  • Issue
    7
  • fYear
    2012
  • Firstpage
    2059
  • Lastpage
    2064
  • Abstract
    Sound source separation has become a topic of intensive research in recent years. The research effort has been especially relevant for the underdetermined case, where a considerable number of sparse methods working in the time-frequency (T-F) domain have appeared. In this context, although binary masking seems to be a preferred choice for source demixing, the estimated masks differ substantially from the ideal ones. This paper proposes a maximum a posteriori (MAP) framework for binary mask estimation. To this end, class-conditional source probabilities according to the observed mixing parameters are modeled via ratios of dependent Cauchy distributions, while source priors are iteratively calculated from the observed histograms. Moreover, spatially smoothed posteriors in the T-F domain are proposed to avoid noisy estimates, showing that the estimated masks are closer to the ideal ones in terms of objective performance measures.
  • Keywords
    blind source separation; maximum likelihood estimation; probability; time-frequency analysis; MAP binary mask estimation; T-F domain; class-conditional source probability; dependent Cauchy distribution; histogram observation; iterative calculation; maximum a posteriori binary mask estimation; mixing parameter observation; noisy estimation; sound source separation; source demixing; spatially smoothed posterior; time-frequency domain; Direction of arrival estimation; Estimation; Histograms; Indexes; Speech; Speech processing; Time frequency analysis; sparse models; time–frequency (T-F) masking
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2012.2195654
  • Filename
    6188514