• DocumentCode
    3558770
  • Title

    Single-Sensor Audio Source Separation Using Classification and Estimation Approach and GARCH Modeling

  • Author

    Abramson, Ari ; Cohen, Israel

  • Author_Institution
    Department of Electrical Engineering, Technion - Israel Institute of Technology, Haifa
  • Volume
    16
  • Issue
    8
  • fYear
    2008
  • Firstpage
    1528
  • Lastpage
    1540
  • Abstract
    In this paper, we propose a new algorithm for single-sensor audio source separation of speech and music signals, which is based on generalized autoregressive conditional heteroscedasticity (GARCH) modeling of the speech signals and Gaussian mixture modeling (GMM) of the music signals. The separation of the speech from the music signal is obtained by a simultaneous classification and estimation approach, which enables one to control the tradeoff between residual interference and signal distortion. Experimental results on mixtures of speech and piano music signals have yielded an improved source separation performance compared to using Gaussian mixture models for both signals. The tradeoff between signal distortion and residual interference is controlled by adjusting some cost parameters, which are shown to determine the missed and false detection rates in the proposed classification and estimation approach.
  • Keywords
    Gaussian processes; audio signal processing; autoregressive processes; blind source separation; signal classification; GARCH modeling; Gaussian mixture modeling; classification approach; estimation approach; generalized autoregressive conditional heteroscedasticity (GARCH); piano music signals; residual interference; signal distortion; single-sensor audio source separation; speech signals; Background noise; Costs; Distortion; Hidden Markov models; Instruments; Interference; Microphones; Multiple signal classification; Source separation; Speech enhancement; Detection and estimation
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2008.2005351
  • Filename
    4648925