• DocumentCode
    3476138
  • Title
    Blind source separation and visual voice activity detection for target speech extraction
  • Author
    Liu, Qingju ; Wang, Wenwu
  • Author_Institution
    Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
  • fYear
    2011
  • fDate
    27-30 Sept. 2011
  • Firstpage
    457
  • Lastpage
    460
  • Abstract
    Despite extensive study, the performance of blind source separation (BSS) is still limited, especially for sensor data collected in adverse environments. Recent studies show that this issue can be mitigated by incorporating multimodal information into the BSS process. In this paper, we propose a method for enhancing the target speech separated by a BSS algorithm from sound mixtures, using visual voice activity detection (VAD) and spectral subtraction. First, a classifier for visual VAD is formed in the off-line training stage, using labelled features extracted from the visual stimuli. Then, we use this visual VAD classifier to detect the voice activity of the target speech. Finally, we apply a multi-band spectral subtraction algorithm to enhance the BSS-separated speech signal based on the detected voice activity. We have tested our algorithm on mixtures generated artificially by mixing filters with different reverberation times, and the results show that our algorithm improves the quality of the separated target signal.
  • Keywords
    blind source separation; reverberation; speech enhancement; BSS; multiband spectral subtraction algorithm; multimodal information; reverberation times; target speech enhancement; target speech extraction; visual VAD classifier; visual voice activity detection; Classification algorithms; multi-band spectral subtraction; multimodal enhancement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Awareness Science and Technology (iCAST), 2011 3rd International Conference on
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4577-0887-9
  • Type
    conf
  • DOI
    10.1109/ICAwST.2011.6163194
  • Filename
    6163194