DocumentCode :
2974911
Title :
Multi-view learning of acoustic features for speaker recognition
Author :
Livescu, Karen ; Stoehr, Mark
Author_Institution :
TTI-Chicago, Chicago, IL, USA
fYear :
2009
fDate :
Dec. 13-17, 2009
Firstpage :
82
Lastpage :
86
Abstract :
We consider learning acoustic feature transformations using an additional view of the data, in this case video of the speaker's face. Specifically, we consider a scenario in which clean audio and video are available at training time, while at test time only noisy audio is available. We use canonical correlation analysis (CCA) to learn linear projections of the acoustic observations that have maximum correlation with the video frames. We provide an initial demonstration of the approach on a speaker recognition task using data from the VidTIMIT corpus. The projected features, in combination with baseline MFCCs, outperform the baseline recognizer in noisy conditions. The techniques we present are quite general, although here we apply them to a specific speaker recognition task. This is the first work of which we are aware in which multiple views are used to learn an acoustic feature projection at training time, while using only the acoustics at test time.
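A minimal sketch (not the authors' code) of the CCA step described in the abstract, assuming frame-aligned acoustic and video feature matrices and using scikit-learn's CCA implementation; the variable names, dimensions, and random stand-in data are illustrative assumptions only:

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)

# Assumed stand-ins for frame-aligned training data:
#   audio_train: MFCC frames (n_frames x 39)
#   video_train: face-image features (n_frames x 100)
audio_train = rng.standard_normal((5000, 39))
video_train = rng.standard_normal((5000, 100))

# Learn linear projections of the acoustic observations that are maximally
# correlated with the video view (the CCA step named in the abstract).
cca = CCA(n_components=20)
cca.fit(audio_train, video_train)

# At test time only (noisy) audio is available: apply the learned
# acoustic-side projection and append it to the baseline MFCCs.
audio_test = rng.standard_normal((1000, 39))
projected = cca.transform(audio_test)          # audio-view canonical variates
combined = np.hstack([audio_test, projected])  # baseline MFCCs + CCA features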
Keywords :
speaker recognition; acoustic feature projection; acoustic feature transformations; canonical correlation analysis; multi-view learning; multiple views; Acoustic noise; Acoustic testing; Automatic speech recognition; Feature extraction; Focusing; Linear discriminant analysis; Loudspeakers; Principal component analysis; Speaker recognition; Video recording;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
Type :
conf
DOI :
10.1109/ASRU.2009.5373462
Filename :
5373462