Regional Information Center for Science and Technology (RICeST) - Audio-visual speaker localization via weighted clustering

DocumentCode :

155633

Title :

Audio-visual speaker localization via weighted clustering

Author :

Gebru, Israel D. ; Alameda-Pineda, Xavier ; Horaud, Radu ; Forbes, Florence

Author_Institution :

INRIA Grenoble Rhone-Alpes, Grenoble, France

fYear :

2014

fDate :

21-24 Sept. 2014

Firstpage :

Lastpage :

Abstract :

In this paper we address the problem of detecting and locating speakers using audiovisual data, and we cast it as a clustering problem. We propose a novel weighted clustering method based on a finite mixture model which explores the idea of non-uniform weighting of observations. Weighted-data clustering techniques have already been proposed, but not in a generative setting as presented here. We introduce a weighted-data mixture model and we formally devise the associated EM procedure. The clustering algorithm is applied to the problem of detecting and localizing a speaker over time using both visual and auditory observations gathered with a single camera and two microphones. Audiovisual fusion is enforced by introducing a cross-modal weighting scheme. We test the robustness of the method with experiments in two challenging scenarios: disambiguating between an active and a non-active speaker, and associating a speech signal with a person.
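
The abstract does not give the model equations, so the following is only a minimal sketch of what EM for a weighted-data Gaussian mixture could look like. It assumes one common formulation in which each observation x_i carries a positive weight w_i that scales the component precision, so that x_i is drawn from N(mu_k, Sigma_k / w_i). The function name weighted_gmm_em, the farthest-point initialization, and the regularization constant are illustrative choices, not taken from the paper, and the cross-modal weighting scheme is not reproduced here: the weights are simply passed in.

```python
import numpy as np

def weighted_gmm_em(X, w, K, n_iter=50, reg=1e-6):
    """EM for a Gaussian mixture in which observation i has covariance Sigma_k / w[i].

    ASSUMPTION: the precision-scaling weight model is one plausible reading of
    "weighted-data mixture model"; the record above does not confirm it.
    X: (n, d) observations, w: (n,) strictly positive weights, K: components.
    """
    n, d = X.shape
    # Deterministic farthest-point initialization of the means (illustrative choice).
    idx = [0]
    for _ in range(1, K):
        d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(d2.argmax()))
    mu = X[idx].astype(float)
    # Every component starts from the global covariance and a uniform prior.
    Sigma = np.stack([np.cov(X.T).reshape(d, d) + reg * np.eye(d)] * K)
    pi = np.full(K, 1.0 / K)

    for _ in range(n_iter):
        # E-step: r[i, k] is proportional to pi_k * N(x_i; mu_k, Sigma_k / w_i).
        log_r = np.empty((n, K))
        for k in range(K):
            diff = X - mu[k]
            maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(Sigma[k]), diff)
            _, logdet = np.linalg.slogdet(Sigma[k])
            log_r[:, k] = np.log(pi[k]) - 0.5 * (
                d * np.log(2 * np.pi) + logdet - d * np.log(w) + w * maha
            )
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: the weights enter the mean and covariance updates, not the priors.
        pi = r.mean(axis=0)
        for k in range(K):
            wr = w * r[:, k]
            mu[k] = wr @ X / wr.sum()
            diff = X - mu[k]
            Sigma[k] = (wr[:, None] * diff).T @ diff / r[:, k].sum() + reg * np.eye(d)
    return pi, mu, Sigma, r
```

Under this assumption a large w_i concentrates the likelihood of x_i around the component mean, so heavily weighted observations pull the clusters more strongly than lightly weighted ones. Setting every w_i to 1 reduces the updates to standard Gaussian-mixture EM.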

Keywords :

audio-visual systems; microphones; mixture models; object detection; pattern clustering; signal detection; speaker recognition; EM procedure; audio-visual speaker localization; audiovisual data; auditory observations; clustering algorithm; cross-modal weighting scheme; finite mixture model; microphones; nonuniform weighting; single camera; speakers detection; speech signal; visual observations; weighted-data clustering techniques; weighted-data mixture model; Cameras; Clustering algorithms; Microphones; Robustness; Speech; Standards; Visualization; Mixture models; audiovisual fusion; multimodal signal processing; weighted-data clustering;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on

Conference_Location :

Reims

Type :

conf

DOI :

10.1109/MLSP.2014.6958874

Filename :

6958874

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=155633