Regional Information Center for Science and Technology - Discriminant binary data representation for speaker recognition

DocumentCode :

2178900

Title :

Discriminant binary data representation for speaker recognition

Author :

Bonastre, J.F. ; Bousquet, P.M. ; Matrouf, D. ; Anguera, X.

Author_Institution :

LIA, Univ. of Avignon, Avignon, France

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

5284

Lastpage :

5287

Abstract :

In the supervector UBM/GMM paradigm, each acoustic file is represented by the mean parameters of a GMM. This supervector space serves as the data representation space and has a high dimensionality. Moreover, the space is not intrinsically discriminant, and a complete speech segment is represented by a single vector, which largely removes the possibility of taking temporal or sequential information into account. This work proposes a new approach in which each acoustic frame is represented in a discriminant binary space. The approach relies on a UBM to structure the acoustic space into regions. Each region is then populated with a set of Gaussian models, denoted as "specificities", that emphasize speaker-specific information. Each acoustic frame is mapped into the discriminant binary space by turning each specificity "on" or "off", which creates a large binary vector. All subsequent steps (speaker reference extraction, likelihood estimation, and decision) take place in this binary space. Although this work is only a first step in this direction, experiments based on the NIST SRE 2008 framework demonstrate the potential of the proposed approach. The approach also opens the opportunity to rethink all the classical processes from a discrete, binary view.
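The frame-to-binary mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not state how a specificity is switched "on" or "off", so the rule used here is an assumption made only for illustration: the UBM component that best explains the frame selects a region, and the `top_k` best-scoring specificities in that region are set to 1. The names `log_gauss`, `binarize_frame`, `regions`, and `top_k` are hypothetical and do not come from the paper.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def binarize_frame(frame, regions, top_k=1):
    """Map one acoustic frame to a binary vector, one bit per specificity.

    regions: list of dicts, each holding a UBM component ("mean", "var")
    and a list of "specificities" given as (mean, var) pairs.
    Assumed rule (not taken from the abstract): choose the best-scoring
    region, then switch on its top_k best-scoring specificities.
    """
    # Pick the region whose UBM component best explains the frame.
    best = max(
        range(len(regions)),
        key=lambda r: log_gauss(frame, regions[r]["mean"], regions[r]["var"]),
    )
    bits = []
    for r, region in enumerate(regions):
        specs = region["specificities"]
        on = set()
        if r == best:
            scores = [log_gauss(frame, m, v) for m, v in specs]
            ranked = sorted(range(len(specs)), key=lambda i: scores[i], reverse=True)
            on = set(ranked[:top_k])
        # Specificities in every other region stay off.
        bits.extend(1 if i in on else 0 for i in range(len(specs)))
    return bits


# Toy one-dimensional model: two regions, two specificities each.
regions = [
    {"mean": [0.0], "var": [1.0],
     "specificities": [([-0.5], [0.25]), ([0.5], [0.25])]},
    {"mean": [5.0], "var": [1.0],
     "specificities": [([4.5], [0.25]), ([5.5], [0.25])]},
]
vec = binarize_frame([5.4], regions)
```

In this toy setup the frame `[5.4]` falls in the second region and is closest to its second specificity, so only the last bit of `vec` is set. A real system would use multi-dimensional cepstral frames and many more specificities per region; the paper itself should be consulted for the actual activation rule.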

Keywords :

Gaussian processes; speaker recognition; NIST SRE 2008 framework; UBM-GMM paradigm; acoustic space; data representation space; discriminant binary data representation; likelihood estimation; speaker reference extraction; Acoustics; Computational modeling; Data models; Generators; Speech; Training; Discrete; binary; discriminant

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5947550

Filename :

5947550

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2178900