Robust speaker clustering in eigenspace

Author

Faltlhauser, R. ; Ruske, G.

Author_Institution

Inst. for Human-Machine-Commun., Technische Univ. München, Munich, Germany

fYear

2001

fDate

2001

Firstpage

57

Lastpage

60

Abstract

We propose a speaker clustering scheme that works in 'eigenspace'. Speaker models are transformed to a low-dimensional subspace using 'eigenvoices'. For the speaker clustering procedure, simple distance measures, e.g. Euclidean distance, can then be applied. Moreover, clustering can be accomplished with base models (for eigenvoice projection) such as Gaussian mixture models as well as conventional HMMs. In the case of HMMs, re-projection to the original space readily yields acoustic models. Clustering in the subspace produces well-balanced clusters and is easy to control. In the field of speaker adaptation, several principal techniques can be distinguished. The most prominent among them are Bayesian adaptation (e.g. MAP), transformation-based approaches (e.g. MLLR, maximum likelihood linear regression), and so-called eigenspace techniques. The latter in particular have become increasingly popular, as they make use of a priori information about the distribution of speaker models. The basic approach is commonly called the eigenvoice (EV) approach. Besides these techniques, speaker clustering is a further attractive adaptation scheme, especially since it can be - and has been - easily combined with the above methods.
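
The pipeline outlined in the abstract (project speaker models into an eigenvoice subspace, cluster there with Euclidean distance, re-project cluster centres to the original space) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: each speaker model is stood in for by a flat "supervector" of parameters, the eigenvoices are obtained by PCA via SVD, the clustering algorithm is plain k-means with a deterministic farthest-first initialisation, the data are synthetic, and all function names are hypothetical.

```python
import numpy as np

def fit_eigenvoices(supervectors, n_components):
    """PCA over speaker supervectors: returns the mean and the top eigenvoices (as rows)."""
    mean = supervectors.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centred data.
    _, _, vt = np.linalg.svd(supervectors - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(supervectors, mean, eigenvoices):
    """Coordinates of each speaker in the low-dimensional eigenspace."""
    return (supervectors - mean) @ eigenvoices.T

def reproject(coords, mean, eigenvoices):
    """Map eigenspace coordinates back to the original parameter space."""
    return mean + coords @ eigenvoices

def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for every point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def farthest_first(points, k):
    """Deterministic initialisation: start at point 0, then repeatedly add the
    point that is farthest from all centres chosen so far."""
    chosen = [0]
    for _ in range(1, k):
        d = np.linalg.norm(points[:, None, :] - points[chosen][None, :, :], axis=2)
        chosen.append(int(d.min(axis=1).argmax()))
    return points[chosen].copy()

def kmeans(points, k, n_iter=50):
    """Plain k-means (Lloyd's algorithm) with Euclidean distance."""
    centroids = farthest_first(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(points, centroids), centroids

# Synthetic stand-in data: 40 "speakers", 120 parameters each, two groups.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=-2.0, scale=0.5, size=(20, 120))
group_b = rng.normal(loc=2.0, scale=0.5, size=(20, 120))
supervectors = np.vstack([group_a, group_b])

mean, eigenvoices = fit_eigenvoices(supervectors, n_components=3)
coords = project(supervectors, mean, eigenvoices)        # shape (40, 3)
labels, centroids = kmeans(coords, k=2)                  # clustering in eigenspace
cluster_models = reproject(centroids, mean, eigenvoices) # shape (2, 120)
```

In this sketch the clustering runs on 3 coordinates per speaker instead of 120, and each re-projected centroid is a full-dimensional parameter vector, loosely mirroring the abstract's remark that re-projection yields usable models in the HMM case.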

Keywords

Bayes methods; Gaussian processes; eigenvalues and eigenfunctions; hidden Markov models; pattern clustering; speech recognition; Bayesian adaptation; Euclidean distance; Gaussian mixture models; HMM; MAP; a-priori information; eigenspace; eigenvoices; maximum likelihood linear regression; speaker adaptation; speaker clustering; Acoustic measurements; Bayesian methods; Decorrelation; Hidden Markov models; Loudspeakers; Maximum likelihood linear regression; Robustness; Spatial databases

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on

Print_ISBN

0-7803-7343-X

Type

conf

DOI

10.1109/ASRU.2001.1034588

Filename

1034588