DocumentCode
2279306
Title
Robust speaker clustering in eigenspace
Author
Faltlhauser, R. ; Ruske, G.
Author_Institution
Inst. for Human-Machine-Commun., Technische Univ. Munchen, Munich, Germany
fYear
2001
fDate
2001
Firstpage
57
Lastpage
60
Abstract
We propose a speaker clustering scheme working in 'eigenspace'. Speaker models are transformed to a low-dimensional subspace using 'eigenvoices'. For the speaker clustering procedure, simple distance measures, e.g. Euclidean distance, can be applied. Moreover, clustering can be accomplished with base models (for eigenvoice projection) like Gaussian mixture models as well as conventional HMMs. In case of HMMs, re-projection to the original space readily yields acoustic models. Clustering in subspace produces a well-balanced cluster and is easy to control. In the field of speaker adaptation, several principal techniques can be distinguished. The most prominent among them are Bayesian adaptation (e.g. MAP), transformation based approaches (MLLR - maximum likelihood linear regression), as well as so-called eigenspace techniques. Especially the latter have become increasingly popular, as they make use of a-priori information about the distribution of speaker models. The basic approach is commonly called the eigenvoice (EV) approach. Besides these techniques, speaker clustering is a further attractive adaptation scheme, especially since it can be - and has been - easily combined with the above methods.
Keywords
Bayes methods; Gaussian processes; eigenvalues and eigenfunctions; hidden Markov models; pattern clustering; speech recognition; Bayesian adaptation; Euclidean distance; Gaussian mixture models; HMM; MAP; a-priori information; eigenspace; eigenvoices; maximum likelihood linear regression; speaker adaptation; speaker clustering; Acoustic measurements; Bayesian methods; Decorrelation; Hidden Markov models; Loudspeakers; Maximum likelihood linear regression; Robustness; Spatial databases
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
Print_ISBN
0-7803-7343-X
Type
conf
DOI
10.1109/ASRU.2001.1034588
Filename
1034588