DocumentCode :
395199
Title :
Towards domain independent speaker clustering
Author :
Moh, Yvonne ; Nguyen, Patrick ; Junqua, Jean-Claude
Author_Institution :
Comput. Sci. Dept., Rheinisch-Westfalische Tech. Hochschule, Aachen, Germany
Volume :
2
fYear :
2003
fDate :
6-10 April 2003
Abstract :
Speaker clustering is a key component in many speech processing applications. We focus on Broadcast News metadata annotation and speaker adaptation. In this setting, speaker clustering consists of identifying who spoke, and when, in a long news broadcast. The clustering algorithm is given a set of short audio segments; ideally, it discovers how many people speak in the broadcast and when each of them speaks. The same problem can be transposed to other domains. In this paper, we present two techniques that require no a priori training: the clustering relies solely on information collected from the test data encountered. Both techniques aim to be portable across domains. The first method is based on the Bayesian information criterion (BIC) with single full-covariance Gaussians. It is fairly primitive but effective. The second method, called speaker triangulation, constructs a coordinate system based on conditional likelihoods of the audio segments and locates clusters in this coordinate system. We achieve state-of-the-art performance on NIST evaluations across different data sets.
Keywords :
Bayes methods; broadcasting; covariance analysis; information theory; meta data; pattern clustering; speech recognition; Bayesian information criterion; NIST evaluations; broadcast news meta data annotation; conditional likelihoods; coordinate system; data sets; domain independent speaker clustering; full-covariance Gaussians; short audio segments; speaker adaptation; speaker triangulation; speech processing applications; test data; Bayesian methods; Broadcasting; Computer science; Laboratories; Loudspeakers; NIST; Speech processing; Speech recognition; Streaming media; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Proceedings
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1202300
Filename :
1202300