Title :
Locality preserving speaker clustering
Author :
Chu, Stephen M. ; Tang, Hao ; Huang, Thomas S.
Author_Institution :
T.J. Watson Res. Center, IBM, Yorktown Heights, NY, USA
fDate :
June 28 2009-July 3 2009
Abstract :
In this paper, we propose an efficient speaker clustering approach based on a locality preserving linear projective mapping in the Gaussian mixture model (GMM) mean supervector space. Although the GMM mean supervector has proven to be an effective representation of speakers, its dimensionality is usually very high. The locality preserving projection (LPP) maps the high-dimensional GMM mean supervector space, in an unsupervised fashion, into a lower-dimensional subspace in which the local neighborhood structure of the data points is optimally preserved. Our speaker clustering experiments clearly show that, in the reduced-dimensional LPP subspace, traditional clustering techniques such as k-means and hierarchical clustering perform significantly better than they do in the original high-dimensional GMM mean supervector space or in its principal component subspace.
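The projection step described in the abstract can be illustrated with a minimal sketch of a generic LPP. It is not the authors' implementation. It assumes a k-nearest-neighbor graph with heat-kernel weights and a small ridge term to keep the eigenproblem well posed. The function name lpp, its parameters, and the random stand-in data are all hypothetical; real input would be one GMM mean supervector per speech segment.

```python
import numpy as np


def lpp(X, n_components=2, n_neighbors=5, reg=1e-6):
    """Generic LPP sketch. X has shape (n_samples, n_features).

    Returns (Y, P): the embedded points and the projection matrix.
    """
    n, d = X.shape
    Xc = X - X.mean(axis=0)  # center the data

    # Pairwise squared Euclidean distances; exclude self-matches.
    sq = np.sum(Xc ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xc @ Xc.T, 0.0)
    np.fill_diagonal(d2, np.inf)

    # k-nearest-neighbor graph with heat-kernel weights.
    cols = np.argsort(d2, axis=1)[:, :n_neighbors].ravel()
    rows = np.repeat(np.arange(n), n_neighbors)
    t = np.mean(d2[rows, cols])  # heuristic kernel width (an assumption)
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)  # symmetrize the graph

    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian

    # Generalized eigenproblem: (X^T L X) a = lambda (X^T D X) a.
    A = Xc.T @ L @ Xc
    B = Xc.T @ D @ Xc
    B = B + reg * (np.trace(B) / d) * np.eye(d)  # ridge term for stability

    # Reduce to a standard symmetric problem via Cholesky, B = C C^T.
    C = np.linalg.cholesky(B)
    M = np.linalg.solve(C, A)
    M = np.linalg.solve(C, M.T).T
    M = 0.5 * (M + M.T)
    _, U = np.linalg.eigh(M)  # eigenvalues in ascending order

    # The smallest eigenvalues give the locality-preserving directions.
    P = np.linalg.solve(C.T, U)[:, :n_components]
    return Xc @ P, P


# Stand-in data: random vectors in place of real GMM mean supervectors.
rng = np.random.default_rng(0)
supervectors = rng.normal(size=(60, 20))
embedded, projection = lpp(supervectors, n_components=3)
```

A clustering method such as k-means or hierarchical clustering would then be run on the rows of embedded rather than on the original vectors. For real supervectors, whose dimensionality typically far exceeds the number of segments, a PCA step before LPP is a common way to keep the eigenproblem tractable; whether the paper does this is not stated in the abstract.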
Keywords :
Gaussian processes; pattern clustering; speech processing; Gaussian mixture model; data point; hierarchical clustering technique; k-mean clustering techniques; local neighborhood structure; locality preserving linear projective mapping; locality preserving speaker clustering; lower-dimensional subspace; mean supervector space; Clustering algorithms; Loudspeakers; Mel frequency cepstral coefficient; Pattern recognition; Predictive models; Principal component analysis; Scattering; Speaker recognition; Speech processing; Training data; Gaussian mixture model; Speaker clustering; locality preserving projection; mean supervector; subspace;
Conference_Title :
Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on
Conference_Location :
New York, NY
Print_ISBN :
978-1-4244-4290-4
Electronic_ISBN :
1945-7871
DOI :
10.1109/ICME.2009.5202542