• DocumentCode
    1178556
  • Title
    The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech
  • Author
    Araki, Shoko ; Mukai, Ryo ; Makino, Shoji ; Nishikawa, Tsuyoki ; Saruwatari, Hiroshi
  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
  • Volume
    11
  • Issue
    2
  • fYear
    2003
  • fDate
    3/1/2003
  • Firstpage
    109
  • Lastpage
    116
  • Abstract
    Despite several recent proposals for blind source separation (BSS) of realistic acoustic signals, separation performance remains insufficient. In particular, performance is severely limited when the impulse responses are long. In this paper, we consider a two-input, two-output convolutive BSS problem. First, we show that it is not beneficial to be constrained by the condition T > P, where T is the frame length of the DFT and P is the length of the room impulse responses. We show that there is an optimum frame size, determined by the trade-off between keeping enough samples in each frequency bin to estimate statistics and covering the whole reverberation. We also clarify the reason for the poor performance of BSS in environments with long reverberation, highlighting that the BSS framework works as two sets of frequency-domain adaptive beamformers. Although BSS, like adaptive beamformers, can reduce reverberant sounds to some extent, it mainly removes the sounds arriving from the jammer direction. This is the reason for the difficulty of BSS in reverberant environments.
  • Keywords
    blind source separation; frequency-domain analysis; interference suppression; reverberation; speech processing; convolutive mixtures; frequency domain blind source separation; frequency-domain adaptive beamformers; impulse responses; jammer; optimum frame size; realistic acoustic signals; speech; statistics; two-input two-output convolutive BSS problem; Finite impulse response filter; Frequency domain analysis; Frequency estimation; Jamming; Proposals; Source separation
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/TSA.2003.809193
  • Filename
    1193577
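
The frame-size trade-off described in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the sampling rate, recording length, impulse response length P, candidate frame lengths T, and the 50% frame overlap are all assumed values chosen for illustration, and `frames_per_bin` is a hypothetical helper. It only shows that, for a fixed amount of data, a longer frame T leaves fewer samples in each frequency bin for estimating statistics, while a shorter frame may fail to cover the reverberation (T > P).

```python
def frames_per_bin(num_samples, frame_len, hop=None):
    """Count the DFT frames obtained from a signal of num_samples samples.

    Each frame contributes one sample to every frequency bin, so this is
    also the number of samples available per bin for estimating statistics.
    The default hop of half a frame (50% overlap) is an assumption.
    """
    if hop is None:
        hop = frame_len // 2
    if num_samples < frame_len:
        return 0
    return (num_samples - frame_len) // hop + 1


if __name__ == "__main__":
    fs = 8000                # assumed sampling rate in Hz
    num_samples = 3 * fs     # assumed 3 s of observed mixture
    P = 2400                 # assumed impulse response length in samples

    for T in (256, 512, 1024, 2048, 4096):
        n = frames_per_bin(num_samples, T)
        print(f"T={T:5d}  samples per bin={n:4d}  covers reverberation (T > P): {T > P}")
```

With these assumed numbers, only the longest frame satisfies T > P, and it also leaves the fewest samples per bin, which is the tension the abstract refers to when it speaks of an optimum frame size.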