Regional Information Center for Science and Technology - Audio-visual speaker recognition for video broadcast news: some fusion techniques

DocumentCode :

3168190

Title :

Audio-visual speaker recognition for video broadcast news: some fusion techniques

Author :

Maison, Benoit ; Neti, Chalapathy ; Senior, Andrew

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

1999

fDate :

1999

Firstpage :

161

Lastpage :

167

Abstract :

Audio-based speaker identification degrades severely when there is a mismatch between training and test conditions due to either channel or noise. In this paper, we explore various techniques to fuse video-based speaker identification with audio-based speaker identification to improve the performance under mismatched conditions. Specifically, we explore techniques to optimally determine the relative weights of the independent decisions based on audio and video to achieve the best combination. Experiments on video broadcast news data suggest that significant improvements can be achieved by the combination in acoustically degraded conditions.
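
The abstract refers to weighting independent audio and video decisions. This record does not state which fusion rule the paper uses, so the following is only a minimal sketch under stated assumptions: each modality yields one score per enrolled speaker, the scores are combined linearly with a single weight `alpha`, and `alpha` is chosen by a grid search on held-out data. The function names `fuse_scores`, `identify`, and `pick_alpha` are hypothetical and do not come from the paper.

```python
def fuse_scores(audio_scores, video_scores, alpha):
    """Linearly combine per-speaker scores; alpha is the audio weight (assumed rule)."""
    return [alpha * a + (1.0 - alpha) * v
            for a, v in zip(audio_scores, video_scores)]


def identify(audio_scores, video_scores, alpha):
    """Return the index of the speaker with the highest fused score."""
    fused = fuse_scores(audio_scores, video_scores, alpha)
    return max(range(len(fused)), key=fused.__getitem__)


def pick_alpha(audio_dev, video_dev, labels, steps=10):
    """Grid-search alpha in [0, 1] for the best accuracy on held-out data.

    This is an illustrative stand-in for "optimally determining the
    relative weights"; the paper's actual procedure may differ.
    """
    best_alpha, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        alpha = i / steps
        correct = sum(
            identify(a, v, alpha) == y
            for a, v, y in zip(audio_dev, video_dev, labels)
        )
        acc = correct / len(labels)
        if acc > best_acc:  # ties keep the smallest alpha
            best_alpha, best_acc = alpha, acc
    return best_alpha, best_acc


if __name__ == "__main__":
    # Toy held-out set with two speakers: the audio scores are misleading
    # (as under acoustic mismatch) while the video scores are reliable.
    audio_dev = [[0.4, 0.6], [0.6, 0.4]]
    video_dev = [[0.9, 0.1], [0.2, 0.8]]
    labels = [0, 1]
    print(pick_alpha(audio_dev, video_dev, labels))
```

With a weight near 0 the decision relies mostly on video, which is the behaviour one would expect to help when the audio channel is degraded; with a weight near 1 it falls back to audio alone.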

Keywords :

acoustic signal processing; audio signal processing; speaker recognition; video signal processing; acoustically degraded conditions; audio-based speaker identification; audio-visual speaker recognition; mismatched conditions; relative independent decision weights; video broadcast news; Broadcasting; Degradation; Face detection; Face recognition; Fuses; Loudspeakers; Multimedia communication; Speaker recognition; Telephony; Testing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multimedia Signal Processing, 1999 IEEE 3rd Workshop on

Conference_Location :

Copenhagen

Print_ISBN :

0-7803-5610-1

Type :

conf

DOI :

10.1109/MMSP.1999.793814

Filename :

793814

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3168190