  • DocumentCode
    2801593
  • Title
    Multi-channel source separation by beamforming trained with factorial HMMs
  • Author
    Reyes-Gomez, Manuel J. ; Raj, Bhiksha ; Ellis, Daniel P. W.
  • Author_Institution
    Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
  • fYear
    2003
  • fDate
    19-22 Oct. 2003
  • Firstpage
    13
  • Lastpage
    16
  • Abstract
    Speaker separation has conventionally been treated as a problem of blind source separation (BSS). This approach uses no knowledge of the statistical characteristics of the signals to be separated, relying mainly on the independence between the signals to separate them. Maximum-likelihood techniques, on the other hand, use knowledge of the a priori probability distributions of the speakers' signals to effect separation. Previously (Reyes-Gomez, M.J. et al., Proc. ICASSP, 2003), we presented a maximum-likelihood speaker separation technique that uses detailed statistical information about the signals to be separated, represented in the form of hidden Markov models (HMMs), to estimate the parameters of a filter-and-sum processor for signal separation. We show that the filters estimated for a particular utterance by a speaker generalize well to other utterances by the same speaker, provided the locations of the speakers remain constant. Thus, filters estimated from a "training" utterance with a known transcript can be used to separate all future signals by that speaker from mixtures of speech signals in an unsupervised manner. The filters are ineffective for other speakers, however, even at the same locations, indicating that they capture the spatio-frequency characteristics of the speaker.
  • Keywords
    array signal processing; audio signal processing; filtering theory; hidden Markov models; maximum likelihood estimation; source separation; statistical analysis; BSS; HMM; a priori probability distributions; beamforming; blind source separation; filter-and-sum processor; maximum-likelihood techniques; multi-channel source separation; multiple microphones; parameter estimation; speaker separation; statistical characteristics; training utterance; filters; information filtering; probability distribution; signal processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applications of Signal Processing to Audio and Acoustics, 2003 IEEE Workshop on.
  • Print_ISBN
    0-7803-7850-4
  • Type
    conf
  • DOI
    10.1109/ASPAA.2003.1285797
  • Filename
    1285797