• DocumentCode
    2183299
  • Title

    Localization based stereo speech separation using deep networks

  • Author

    Yu, Yang ; Wang, Wenwu ; Luo, Jian ; Feng, Pengming

  • Author_Institution
    School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an, China, 710072
  • fYear
    2015
  • fDate
    21-24 July 2015
  • Firstpage
    153
  • Lastpage
    157
  • Abstract
    Time-frequency (T-F) masking is an effective method for stereo speech source separation. However, reliable estimation of the T-F mask from sound mixtures is a challenging task, especially when room reverberation is present in the mixtures. In this paper, we propose a new stereo speech separation system in which deep networks are used to generate a soft T-F mask for separation. More specifically, the deep network, which is composed of two sparse autoencoders and a softmax classifier, is used to estimate the orientations of the target and interferers at each T-F unit, based on low-level features such as the mixing vector (MV) and the interaural level and phase differences (ILD/IPD). The deep network is trained by a greedy layer-wise method using a dataset generated by convolving room impulse responses (RIRs) with clean speech signals positioned at different angles with respect to the sensors. With the trained deep networks, the probability that each T-F unit belongs to the target or an interferer can be estimated from the localization cues and used to generate the soft mask. Experiments based on real binaural RIRs and the TIMIT dataset show the performance of the proposed system on reverberant speech mixtures, compared with a recently proposed model-based T-F masking technique.
  • Keywords
    Feature extraction; Neural networks; Reverberation; Source separation; Speech; Speech processing; Training; Deep learning; Deep networks; Soft mask
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Signal Processing (DSP), 2015 IEEE International Conference on
  • Conference_Location
    Singapore, Singapore
  • Type

    conf

  • DOI
    10.1109/ICDSP.2015.7251849
  • Filename
    7251849
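
The soft-mask step described in the abstract can be illustrated with a minimal sketch. It assumes the stereo mixture is already available as two complex T-F matrices, and it replaces the trained network (two sparse autoencoders plus a softmax classifier) with a random linear layer followed by a softmax, purely as a placeholder. The function names `ild_ipd`, `softmax`, and `soft_mask`, the array sizes, and the choice of class 0 as the target direction are all hypothetical and are not taken from the paper.

```python
import numpy as np

def ild_ipd(left, right, eps=1e-12):
    """Interaural level difference (dB) and phase difference (rad) per T-F unit."""
    mag_l = np.abs(left) + eps
    mag_r = np.abs(right) + eps
    ild = 20.0 * np.log10(mag_l / mag_r)
    ipd = np.angle(left * np.conj(right))
    return ild, ipd

def softmax(scores, axis=-1):
    """Numerically stable softmax over the direction classes."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def soft_mask(direction_probs, target_classes):
    """Sum the posterior over the direction classes assigned to the target."""
    return direction_probs[..., target_classes].sum(axis=-1)

# Synthetic stereo T-F data (assumption): the left channel is the right
# channel scaled by 2 and phase-shifted by 0.5 rad in every T-F unit.
rng = np.random.default_rng(0)
n_freq, n_frames, n_directions = 4, 5, 3
right = (rng.standard_normal((n_freq, n_frames))
         + 1j * rng.standard_normal((n_freq, n_frames)))
left = 2.0 * np.exp(1j * 0.5) * right

ild, ipd = ild_ipd(left, right)
features = np.stack([ild, ipd], axis=-1)          # shape (freq, frame, 2)

# Placeholder for the trained deep network: one random linear layer.
weights = rng.standard_normal((2, n_directions))
probs = softmax(features @ weights)               # shape (freq, frame, directions)

# Soft mask: probability that each T-F unit belongs to the target direction.
mask = soft_mask(probs, target_classes=[0])
target_estimate = mask * left                     # masked left-channel T-F units
```

In a real system the placeholder weights would be replaced by the trained classifier's per-unit direction posteriors, and `target_estimate` would be inverted back to the time domain with the matching inverse transform.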