• DocumentCode
    112695
  • Title
    Multi-Channel Audio Source Separation Using Multiple Deformed References
  • Author
    Souviraa-Labastie, Nathan ; Olivero, Anaik ; Vincent, Emmanuel ; Bimbot, Frederic
  • Author_Institution
    IRISA, Univ. Rennes 1, Rennes, France
  • Volume
    23
  • Issue
    11
  • fYear
    2015
  • fDate
    Nov. 2015
  • Firstpage
    1775
  • Lastpage
    1787
  • Abstract
    We present a general multi-channel source separation framework where additional audio references are available for one (or more) source(s) of a given mixture. Each audio reference is another mixture which is supposed to contain at least one source similar to one of the target sources. Deformations between the sources of interest and their references are modeled in a linear manner using a generic formulation. This is done by adding transformation matrices to an excitation-filter model, hence affecting different axes, namely frequency, dictionary component or time. A nonnegative matrix co-factorization algorithm and a generalized expectation-maximization algorithm are used to estimate the parameters of the model. Different model parameterizations and different combinations of algorithms are tested on music plus voice mixtures guided by music and/or voice references and on professionally-produced music recordings guided by cover references. Our algorithms improve the signal-to-distortion ratio (SDR) of the sources with the lowest intensity by 9 to 15 decibels (dB) with respect to original mixtures.
  • Keywords
    audio signal processing; expectation-maximisation algorithm; filtering theory; matrix decomposition; source separation; SDR; excitation-filter model; generalized expectation-maximization algorithm; generic formulation; multichannel audio source separation; multiple deformed reference; music plus voice mixture; nonnegative matrix cofactorization algorithm; parameter estimation; signal-to-distortion ratio; source deformations; Covariance matrices; Deformable models; Dictionaries; Spectrogram; Speech; Speech processing; Generalized Expectation-Maximization (GEM) algorithm
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2015.2450494
  • Filename
    7138614