DocumentCode :
3238241
Title :
Using a Visual Voice Activity Detector to Regularize the Permutations in Blind Separation of Convolutive Speech Mixtures
Author :
Rivet, Bertrand ; Girin, Laurent ; Servière, Christine ; Pham, Dinh-Tuan ; Jutten, Christian
Author_Institution :
Grenoble Image Parole Signal Autom. (GIPSA), Grenoble
fYear :
2007
fDate :
1-4 July 2007
Firstpage :
223
Lastpage :
226
Abstract :
Audio-visual speech source separation combines visual speech processing techniques (e.g. lip parameter tracking) with source separation methods to improve and/or simplify the extraction of a speech signal from a mixture of acoustic signals. In this paper, we present a new approach to this problem: visual information is used as a voice activity detector (VAD). Results show that, in the difficult case of realistic convolutive mixtures, the classic problem of permutation of the output frequency channels can be solved using visual information, with simpler processing than when using audio information alone.
Keywords :
acoustic convolution; blind source separation; speech processing; blind separation; convolutive speech mixtures; source separation; speech signal extraction; visual voice activity detector; Acoustic signal detection; Coherence; Data mining; Detectors; Frequency; Separation processes; Signal processing; audiovisual speech; convolutive mixtures; visual voice activity detection
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Signal Processing, 2007 15th International Conference on
Conference_Location :
Cardiff
Print_ISBN :
1-4244-0882-2
Electronic_ISBN :
1-4244-0882-2
Type :
conf
DOI :
10.1109/ICDSP.2007.4288559
Filename :
4288559