• DocumentCode
    1653252
  • Title
    Spatial and coherence cues based time-frequency masking for binaural reverberant speech separation
  • Author
    Alinaghi, Atiyeh; Wang, Wenwu; Jackson, Philip J. B.
  • Author_Institution
    Dept. of Electron. Eng. (FEPS), Univ. of Surrey, Guildford, UK
  • fYear
    2013
  • Firstpage
    684
  • Lastpage
    688
  • Abstract
    Most binaural source separation algorithms consider only the dissimilarities between the recorded mixtures, such as interaural phase and level differences (IPD, ILD), to classify the time-frequency (T-F) regions of the mixture spectrograms and assign them to each source. In this paper, however, we show that the coherence between the left and right recordings provides extra information for labelling the T-F units from the sources. Using coherence also reduces the effect of reverberation, which consists of random reflections from different directions that show low correlation between the sensors. Our algorithm assigns the T-F regions to the original sources based on a weighted combination of IPD, ILD, the mixing vector models, and the estimated interaural coherence (IC) between the left and right recordings. Binaural room impulse responses measured in four rooms with various acoustic conditions were used to evaluate the proposed method, which shows an average improvement of more than 2.23 dB in signal-to-distortion ratio (SDR) over the state-of-the-art algorithms in room D with T60 = 0.89 s.
  • Keywords
    blind source separation; coherence; reverberation; speech processing; time-frequency analysis; transient response; IC; ILD; IPD; SDR; T-F region; binaural reverberant speech separation; binaural room impulse response; binaural source separation algorithm; interaural coherence; interaural level difference; interaural phase difference; reverberation effect; sensor; signal-to-distortion ratio; spectrogram mixture; time-frequency masking; vector model; Azimuth; Coherence; Integrated circuits; Reverberation; Source separation; Speech; Vectors; Precedence effect; binaural cues; blind source separation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6637735
  • Filename
    6637735
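
The three interaural cues named in the abstract (IPD, ILD, IC) can be illustrated with a minimal sketch for a single frequency bin. This is not the authors' implementation: the abstract does not give their estimators, weighting, or mixing vector models. The function name `interaural_cues`, the averaging over a short run of frames, the sign convention (left relative to right), and the `eps` regulariser are all assumptions made for illustration, using the common textbook definitions of these quantities.

```python
import cmath
import math


def interaural_cues(left, right, eps=1e-12):
    """Estimate IPD, ILD and IC for one frequency bin (illustrative only).

    left, right: equal-length sequences of complex STFT coefficients for the
    same frequency bin over consecutive frames (assumed input layout).
    """
    # Cross-spectrum and auto-spectra, summed over the frames.
    cross = sum(l * r.conjugate() for l, r in zip(left, right))
    p_left = sum(abs(l) ** 2 for l in left)
    p_right = sum(abs(r) ** 2 for r in right)

    ipd = cmath.phase(cross)  # radians, left relative to right
    ild = 10.0 * math.log10((p_left + eps) / (p_right + eps))  # dB
    ic = abs(cross) / math.sqrt(p_left * p_right + eps)  # 0 (diffuse) to 1
    return ipd, ild, ic


# Direct-path-like case: right is a scaled, phase-shifted copy of left.
left = [1 + 1j, 2 - 0.5j, -0.3 + 0.7j]
right = [0.5 * l * cmath.exp(-0.3j) for l in left]
ipd, ild, ic = interaural_cues(left, right)

# Diffuse-like case: the phase relation changes from frame to frame.
_, _, ic_diffuse = interaural_cues([1, 1, 1, 1], [1, 1j, -1, -1j])
```

In the first case the coherence is close to 1 and the IPD recovers the 0.3 rad shift; in the second the cross terms cancel and the coherence drops to 0. That contrast is the property the abstract relies on when it says reverberant reflections show low correlation between the sensors.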