• DocumentCode
    2059706
  • Title
    Acoustic vector sensor based reverberant speech separation with probabilistic time-frequency masking
  • Author
    Xionghu Zhong ; Xiaoyi Chen ; Wenwu Wang ; Atiyeh Alinaghi ; A.B. Premkumar
  • Author_Institution
    Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
  • fYear
    2013
  • fDate
    9-13 Sept. 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Most existing speech source separation algorithms have been developed for separating sound mixtures acquired by using a conventional microphone array. In contrast, little attention has been paid to the problem of source separation using an acoustic vector sensor (AVS). We propose a new method for the separation of convolutive mixtures by incorporating the intensity vector of the acoustic field, obtained using spatially co-located microphones which carry the direction of arrival (DOA) information. The DOA cues from the intensity vector, together with the frequency bin-wise mixing vector cues, are then used to determine the probability of each time-frequency (T-F) point of the mixture being dominated by a specific source, based on the Gaussian mixture models (GMM), whose parameters are evaluated and refined iteratively using an expectation-maximization (EM) algorithm. Finally, the probability is used to derive the T-F masks for recovering the sources. The proposed method is evaluated in simulated reverberant environments in terms of signal-to-distortion ratio (SDR), giving an average improvement of approximately 1.5 dB as compared with a related T-F mask approach based on a conventional microphone setting.
  • Keywords
    Gaussian processes; acoustic signal processing; direction-of-arrival estimation; expectation-maximisation algorithm; microphone arrays; mixture models; source separation; speech intelligibility; speech recognition; DOA information; Gaussian mixture models; acoustic vector sensor; convolutive mixtures; direction of arrival information; expectation-maximization algorithm; frequency bin-wise mixing vector cues; intensity vector; microphone array; probabilistic time-frequency masking; reverberant speech separation; signal-to-distortion ratio; sound mixtures; spatially colocated microphones; speech source separation algorithms; time-frequency point; Abstracts; Acoustic distortion; Educational institutions; Transforms; Acoustic vector sensor; EM algorithm; acoustic intensity; blind source separation; direction of arrival;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European
  • Conference_Location
    Marrakech
  • Type
    conf
  • Filename
    6811680