DocumentCode
3476138
Title
Blind source separation and visual voice activity detection for target speech extraction
Author
Liu, Qingju ; Wang, Wenwu
Author_Institution
Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
fYear
2011
fDate
27-30 Sept. 2011
Firstpage
457
Lastpage
460
Abstract
Although blind source separation (BSS) has been studied extensively, its performance is still limited, especially for sensor data collected in adverse environments. Recent studies show that this limitation can be mitigated by incorporating multimodal information into the BSS process. In this paper, we propose a method for enhancing the target speech separated from sound mixtures by a BSS algorithm, using visual voice activity detection (VAD) and spectral subtraction. First, a visual VAD classifier is trained off-line on labelled features extracted from the visual stimuli. We then use this classifier to detect the voice activity of the target speech. Finally, we apply a multi-band spectral subtraction algorithm to enhance the BSS-separated speech signal based on the detected voice activity. We have tested our algorithm on mixtures generated artificially using mixing filters with different reverberation times, and the results show that it improves the quality of the separated target signal.
Keywords
blind source separation; reverberation; speech enhancement; BSS; multiband spectral subtraction algorithm; multimodal information; reverberation times; target speech enhancement; target speech extraction; visual VAD classifier; visual voice activity detection; Classification algorithms; multi-band spectral subtraction; multimodal enhancement
fLanguage
English
Publisher
IEEE
Conference_Titel
Awareness Science and Technology (iCAST), 2011 3rd International Conference on
Conference_Location
Dalian
Print_ISBN
978-1-4577-0887-9
Type
conf
DOI
10.1109/ICAwST.2011.6163194
Filename
6163194